M2: Network Analysis and Natural Language Processing

2020/2021

Content, progress and pedagogy of the module

Aim: M2 aims to give students insight into network and unstructured data types, as well as state-of-the-art approaches to mapping and analysing these data. The insights and techniques gained in this module will allow students to approach real-world problems in marketing (Who are the main influencers among our customers?), management (Can we identify new discourses in the communication within our organisation?), business economics (Can language patterns be used to understand R&D intensity across companies?), political science (How is a political candidate perceived by a certain demographic, based on their social network statements?), and sociology (How are a person’s behaviour and characteristics affected by their social network?).

Content:

With the accelerating digitalisation of the modern world, we capture and store a growing amount of relational and unstructured (e.g. text) data. Relational data encode social, biological, physical and other complex systems as a collection of actual or potential relations between entities. These entities can be users in an online social network, companies in a cluster, or research articles in a database linked via some association metric. Exploring such networks helps to unveil latent and general structural patterns, and to understand how the interaction between elements is reflected in their attributes or how information flows through these systems. Indeed, envisioning and analysing complex systems such as national economies, natural ecosystems, or social interactions as networks has brought fresh momentum to a broad range of academic disciplines and professional sectors alike. Working with relational data is not difficult, but it certainly requires some rethinking.

The other type of data, unstructured data, come in many varieties. The one that is arguably most attractive for social science analytics is text. Language encodes a vast range of meanings, entities, and relations. Natural language processing (NLP) has advanced considerably in recent years, making unstructured text suitable for machine learning.

The link between networks and unstructured data lies in the fact that unstructured data usually encode something closer to a depiction of reality than traditional structured data do. Thus, they typically contain information on objects and their attributes as well as relational features linking the objects. Understanding the relational dimension is therefore essential to working with unstructured data.
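
As a rough illustration of how relational structure can be read out of text, the sketch below builds a small co-occurrence network from a few sentences and counts each node's neighbours. The documents, the fixed list of entity names, and the helper functions `cooccurrence_edges` and `degree` are all invented for this example; they are assumptions, not material from the module, and a real pipeline would typically rely on an NLP library to detect entities.

```python
from collections import Counter
from itertools import combinations

# Toy documents, invented purely for illustration.
documents = [
    "alice emailed bob about the budget",
    "bob forwarded the budget to carol",
    "carol thanked bob",
]

# Hypothetical fixed set of entity names (an assumption of this sketch).
entities = {"alice", "bob", "carol"}


def cooccurrence_edges(docs, names):
    """Count how often each pair of names appears in the same document."""
    edges = Counter()
    for doc in docs:
        # Keep only the known names mentioned in this document.
        present = sorted(set(doc.lower().split()) & names)
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges


def degree(edges):
    """Number of distinct neighbours per node (unweighted degree)."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {node: len(adj) for node, adj in neighbours.items()}


edges = cooccurrence_edges(documents, entities)
degrees = degree(edges)
print(dict(edges))
print(degrees)
```

In this toy network, the edge weights record how often two people are mentioned together, and the degree gives a first, very simple indication of who is most connected.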

Upon completion, students will have built a solid knowledge foundation within network theory and analysis, computational linguistics and broader (unstructured) data processing. The module is application-focused, and thus students will gain a variety of skills to utilise relational and unstructured text data for analysis purposes.

Learning objectives

Knowledge

  • Show insight into the conceptual particularities and explanatory power of relational and network data.
  • Explain the interplay between network-theory concepts and real-world networks.
  • Understand the theoretical foundations, core algorithms and metrics in network analysis.
  • Explain the concepts of multi-dimensional and multimodal networks and demonstrate comprehension of how they can be used for feature detection.
  • Describe the main approaches to using network data in more general machine learning settings.
  • Explain the main techniques used in data mining and data structuring.
  • Explain central concepts within computational linguistics and methods in natural language processing.
  • Reflect upon the epistemology of language data.
  • Explain how language data is integrated into analytical frameworks.

Skills

  • Source, store and pre-process network and text data.
  • Calculate and interpret essential statistical metrics.
  • Integrate network indicators into machine learning pipelines.
  • Handle multiplex and multimodal networks.
  • Visualise networks and interaction patterns.
  • Perform grammar-based labelling and modifications on text data.
  • Perform tasks such as automated summarisation and sentiment analysis.
  • Extract entities from text.
  • Identify topics within large collections of documents.
  • Calculate semantic similarity.
  • Train and use word embedding models.
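
To make one of the skills above more concrete, the following sketch computes the cosine similarity between bag-of-words vectors of short texts. It is a deliberately simple stand-in: word counts only capture lexical overlap, whereas similarity in a more semantic sense would normally be computed on word or document embeddings. The function names and example sentences are assumptions made for this illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a text as lower-cased word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    shared = set(a) & set(b)
    dot = sum(a[term] * b[term] for term in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text is treated as similar to nothing.
    return dot / (norm_a * norm_b)


doc_a = bag_of_words("the network is dense")
doc_b = bag_of_words("the network is sparse")
doc_c = bag_of_words("students write exams")

sim_ab = cosine_similarity(doc_a, doc_b)
sim_ac = cosine_similarity(doc_a, doc_c)
print(sim_ab, sim_ac)
```

Texts that share most of their words score close to 1, while texts with no words in common score 0.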

Competences

  • Represent real-life complex systems as networks.
  • Identify latent patterns, structures and interactions of entities in these systems.
  • Explore the interplay between the structure of systems and their performance as well as particular features and behaviour of individual entities.
  • Utilise natural language data for various types of mapping and analysis.

Type of instruction

Lectures will be complemented by online resources and e-learning tools such as podcasting, online tutorials, and mini-assignments, as integral parts of the teaching methodology to enhance student engagement outside the classroom. Physical face-to-face time will be centred around the tacit and interactive components of the problem-solving processes.

Exam

Prerequisite for enrollment for the exam

  • A prerequisite for participating in the exam is that the student has participated actively in developing written material during the module.

Exams

Name of exam: Module 2: Network Analysis and Natural Language Processing
Type of exam: Oral exam. Group examination with max. 6 students.
ECTS: 5
Assessment: 7-point grading scale
Type of grading: Internal examination
Criteria of assessment: The criteria of assessment are stated in the Examination Policies and Procedures

Facts about the module

Danish title: M2: Network Analysis and Natural Language Processing
Module code: KAØKO202019
Module type: Course
Duration: 1 semester
Semester: Autumn
ECTS: 5
Language of instruction: English
Location of the lecture: Campus Aalborg
Responsible for the module

Organisation

Study Board: Study Board of Economics
Department: AAU Business School
Faculty: The Faculty of Social Sciences