M2: Network Analysis and Natural Language Processing

2018/2019

Prerequisites/Recommended prerequisites for participating in the module

Completed course in applied statistics or similar.

Content, progress and pedagogy of the module

Aim: M2 aims to give students insight into network and unstructured data types, as well as state-of-the-art approaches to map and analyse these data. Insights and techniques gained in this module will allow students to approach real-world problems in marketing (Who are the main influencers among our customers?), management (Can we identify new discourses in the communication within our organisation?), business economics (Can language patterns be used to understand R&D intensity across companies?), political science (How is a political candidate perceived by a certain demographic, based on their social network statements?), and sociology (How are a person’s behaviour and characteristics affected by their social network?).

Content:

With the accelerating digitalisation of the modern world, we capture and store a growing amount of relational and unstructured (e.g. text) data. The former type of data encodes social, biological, physical and other complex systems as a collection of actual or potential relations between entities. These can be users in an online social network, companies in a cluster, or research articles in a database linked via some association metric. Exploring such networks helps to unveil latent and general structural patterns and to understand how the interaction between elements reflects on their attributes or how information flows through these systems. Indeed, envisioning and analysing complex systems such as national economies, natural ecosystems, or social interactions as networks has brought fresh wind to a broad range of academic disciplines and professional sectors alike. Working with relational data is not difficult, but it certainly requires some rethinking.
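
As a rough illustration of what "a collection of relations between entities" can look like in practice, the following sketch stores a small network as a list of pairs and computes degree centrality, one of the simplest network metrics. The names and relations are invented for illustration, and the function name is hypothetical; it is a minimal example under these assumptions, not the tooling used in the module.

```python
from collections import defaultdict

# Hypothetical toy data: each pair is one undirected relation between two entities.
edges = [
    ("Ana", "Ben"),
    ("Ana", "Cem"),
    ("Ana", "Dee"),
    ("Ben", "Cem"),
]


def degree_centrality(edge_list):
    """Return each node's number of neighbours divided by (n - 1)."""
    neighbours = defaultdict(set)
    for source, target in edge_list:
        # Undirected network: record the relation in both directions.
        neighbours[source].add(target)
        neighbours[target].add(source)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adjacent) / (n - 1) for node, adjacent in neighbours.items()}


centrality = degree_centrality(edges)
most_central = max(centrality, key=centrality.get)
```

Here the unit of observation is the relation rather than the individual entity, which is the kind of rethinking relational data requires: an entity's score depends on who it is connected to, not on its own attributes.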

The other type of data, unstructured data, comes in many varieties. The one that is arguably most attractive for social science analytics is text. Language encodes a vast range of meanings, entities, and relations. Natural language processing (NLP) has advanced considerably in recent years, making unstructured text suitable for machine learning.

The link between networks and unstructured data lies in the fact that unstructured data usually encode something closer to a depiction of reality than traditional structured data do. They therefore typically contain information on objects and their attributes as well as relational features linking those objects. Understanding the relational dimension is thus essential to working with unstructured data.
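
One simple way to make this link concrete is a co-occurrence network, in which two entities are connected whenever they are mentioned in the same text. The sketch below assumes plain whitespace tokenisation and a hand-written entity list; the documents, entity names and function name are all hypothetical, and real text would need proper entity extraction first.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus and entity list, invented for illustration only.
documents = [
    "Acme signed a research agreement with Borealis",
    "Borealis and Cobalt share a supplier",
    "Acme and Borealis opened a joint lab",
]
entities = {"Acme", "Borealis", "Cobalt"}


def cooccurrence_edges(texts, known_entities):
    """Count how often each pair of known entities appears in the same text."""
    weights = Counter()
    for text in texts:
        # Naive whitespace tokenisation; keep only tokens that are known entities.
        mentioned = sorted(set(text.split()) & known_entities)
        for pair in combinations(mentioned, 2):
            weights[pair] += 1
    return weights


edge_weights = cooccurrence_edges(documents, entities)
```

The resulting weighted pairs form an edge list, so text that started out unstructured can be analysed with the same network methods as any other relational data.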

Upon completion, students will have built a solid knowledge foundation within network theory and analysis, computational linguistics and broader (unstructured) data processing. The module is application-focused, and thus students will gain a variety of skills to utilise relational and unstructured text data for analysis purposes.

Learning objectives

Knowledge

  • Show insight into the conceptual particularities and explanatory power of relational and network data.
  • Explain the interplay between network-theory concepts and real-world networks.
  • Understand the theoretical foundations, core algorithms and metrics in network analysis.
  • Explain the concepts of multi-dimensional and multimodal networks and demonstrate comprehension of how they can be used for feature detection.
  • Describe main approaches to using network data in more general machine learning settings.
  • Explain the main techniques used in data mining and data structuring.
  • Explain central concepts within computational linguistics and methods in natural language processing.
  • Reflect upon the epistemology of language data.
  • Explain how language data is integrated into analytical frameworks.

Skills

  • Source, store and pre-process network and text data.
  • Calculate and interpret essential statistical metrics.
  • Integrate network indicators into machine learning pipelines.
  • Handle multiplex and multimodal networks.
  • Visualise networks and interaction patterns.
  • Perform grammar-based labelling and modifications on text data.
  • Perform tasks such as automated summarisation and sentiment analysis.
  • Extract entities from text.
  • Identify topics within large collections of documents.
  • Calculate semantic similarity.
  • Train and use word embedding models.

Competencies

  • Represent real-life complex systems as networks.
  • Identify latent patterns, structures and interactions of entities in these systems.
  • Explore the interplay between the structure of systems and their performance as well as particular features and behaviour of individual entities.
  • Utilise natural language data for various types of mapping and analysis.

Type of instruction

Lectures will be complemented by online resources and e-learning tools such as podcasting, online tutorials, and mini-assignments, as integral parts of the teaching methodology to enhance student engagement outside the classroom. Physical face-to-face time will be centred around the tacit and interactive components of the problem-solving processes.

Exam

Exams

Name of exam: Module 2: Network Analysis and Natural Language Processing
Type of exam: Written and oral
Portfolio exam:
60% obtained through various graded (and supervised peer-graded) problem sheets and mini-assignments throughout the module.
40% final internal evaluation seminar with oral presentation, peer evaluation (opponent group), internal critique and discussion based on the final assignment and presentation.
ECTS: 5
Assessment: 7-point grading scale
Type of grading: Internal examination
Criteria of assessment: Module 2 is assessed according to the Danish 7-point grading scale. The grade 12 will be awarded to students who give an excellent performance and demonstrate that they have fulfilled the above objectives exhaustively or with few insignificant omissions. The grade 02 will be awarded to students who demonstrate that they have fulfilled the minimum acceptable level of the above learning objectives.

Facts about the module

English title: M2: Network Analysis and Natural Language Processing
Module code: KASDC20182
Module type: Course
Duration: 1 semester
Semester: Autumn
ECTS: 5
Language of instruction: English
Location of instruction: Campus Aalborg
Module coordinator:

Organisation

Study board: Studienævnet for Samfundsøkonomi
Faculty: Det Samfundsvidenskabelige Fakultet