Important

All Microsoft Academic Services have been officially retired as of December 31st, 2021. We are currently retaining original documentation as-is for educational use, however all information about signing up for services is no longer valid, and support and service (API) links will not function.


Language Similarity Package

The Microsoft Academic Language Similarity Package provides supplementary processing functionality for use with the Microsoft Academic Graph (MAG). This package includes Language Similarity API and required resources. This API provides functionality for:

  • Similarity comparison between input texts using pre-trained word embeddings which are trained on the MAG corpus, and
  • Labeling text with fields of study defined in MAG.

Prerequisites

Before running these examples, you need to complete the following setups:

System Requirements

  • Microsoft Windows 7 (or above) 64-bit OS
  • .NET Framework version 4.5.2+
  • Visual Studio 2015 (or above)

Contents

The Language Similarity package is distributed as a single zip file. It is located at nlp\LanguageSimilarity.zip in the MAG container.

LanguageSimilarity.zip

It includes algorithms in dlls and resources with pre‑trained models. After unzipping the package, users will see a folder structure as shown in the figure below. README files contain general information about the package, system requirements, and API signatures.

Language Similarity Package content

We also include a C# demo project in the LanguageSimilarityExample folder. It contains sample.txt as input for the demo project. The demo project is a console program which takes resource directory and the sample.txt path as paremeters. The resource directory is to initialize the language similarity models, while sample.txt is used to provide prarmeters for calling methods in this package.

Constructor

Methods

Sample

Resource