Project lead tasks in the Team Data Science Process
This article describes tasks that a project lead completes to set up a repository for their project team in the Team Data Science Process (TDSP). The TDSP is a framework developed by Microsoft that provides a structured sequence of activities to efficiently execute cloud-based, predictive analytics solutions. The TDSP is designed to help improve collaboration and team learning. For an outline of the personnel roles and associated tasks for a data science team standardizing on the TDSP, see Team Data Science Process roles and tasks.
A project lead manages the daily activities of individual data scientists on a specific data science project in the TDSP. The following diagram shows the workflow for project lead tasks:
This tutorial covers Step 1: Create project repository, and Step 2: Seed project repository from your team ProjectTemplate repository.
For Step 3: Create Feature work item for project, and Step 4: Add Stories for project phases, see Agile development of data science projects.
For Step 5: Create and customize storage/analysis assets and share, if necessary, see Create team data and analytics resources.
For Step 6: Set up security control of project repository, see Add team members and configure permissions.
This article uses Azure Repos to set up a TDSP project, because that is how Microsoft implements the TDSP. If your team uses another code hosting platform, the project lead tasks are the same, but the steps to complete them might differ.
Prerequisites
This tutorial assumes that the following resources and permissions are already set up:
- The Azure DevOps organization for your data unit
- A team project for your data science team
- Team template and utilities repositories
- Permissions on your organization account to create and edit repositories for your project
To clone repositories and modify content on your local machine or Data Science Virtual Machine (DSVM), or to set up Azure file storage and mount it to your DSVM, you also need the following:
- An Azure subscription.
- Git installed on your machine. If you're using a DSVM, Git is pre-installed. Otherwise, see the Platforms and tools appendix.
- If you want to use a DSVM, the Windows or Linux DSVM created and configured in Azure. For more information and instructions, see the Data Science Virtual Machine Documentation.
- For a Windows DSVM, Git Credential Manager (GCM) installed on your machine. In the GCM README.md file, scroll down to the Download and Install section and select the latest installer. Download the .exe installer from the installer page and run it.
- For a Linux DSVM, an SSH public key set up on your DSVM and added in Azure DevOps. For more information and instructions, see the Create SSH public key section in the Platforms and tools appendix.
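Part of the checklist above can be verified locally before you start. The following is a minimal sketch, not an official TDSP utility: the function names are hypothetical, and it assumes that SSH public keys are stored as `*.pub` files in `~/.ssh`. It only checks that Git is on the PATH and that a public key file exists. It does not confirm that the key was added in Azure DevOps.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Optional


def git_version() -> Optional[str]:
    """Return the output of `git --version`, or None if Git is not on PATH."""
    git = shutil.which("git")
    if git is None:
        return None
    result = subprocess.run(
        [git, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


def has_ssh_public_key(ssh_dir: Optional[Path] = None) -> bool:
    """Report whether any *.pub file exists in ssh_dir (assumed default: ~/.ssh)."""
    if ssh_dir is None:
        ssh_dir = Path.home() / ".ssh"
    return ssh_dir.is_dir() and any(ssh_dir.glob("*.pub"))


if __name__ == "__main__":
    print("Git:", git_version() or "not found")
    print("SSH public key present:", has_ssh_public_key())
```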
Create a project repository in your team project
To create a project repository in your team's MyTeam project:
1. Go to your team's project Summary page at https://<server name>/<organization name>/<team name>, for example, https://dev.azure.com/DataScienceUnit/MyTeam, and select Repos from the left navigation.
2. Select the repository name at the top of the page, and then select New repository from the dropdown.
3. In the Create a new repository dialog, make sure Git is selected under Type. Enter DSProject1 under Repository name, and then select Create.
4. Confirm that you can see the new DSProject1 repository on your project settings page.
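If you prefer to script repository creation instead of using the portal, the Azure DevOps REST API exposes a Git repositories endpoint. The following sketch only builds the HTTP request and does not send it. It rests on several assumptions that are not part of this article: the function name is hypothetical, the API version is `7.0`, authentication uses a personal access token (PAT) with HTTP Basic, the organization and project names are URL-safe, and a body that contains only `name` is sufficient when the project appears in the URL. Verify these details against the REST API reference before relying on them.

```python
import base64
import json
import urllib.request


def build_create_repo_request(
    organization: str,
    project: str,
    repo_name: str,
    pat: str,
    server: str = "dev.azure.com",
    api_version: str = "7.0",  # assumed version; check the REST reference
) -> urllib.request.Request:
    """Build (but do not send) a POST request that would create a Git repository."""
    url = (
        f"https://{server}/{organization}/{project}"
        f"/_apis/git/repositories?api-version={api_version}"
    )
    # A PAT is sent as the password part of HTTP Basic auth with an empty user name.
    token = base64.b64encode(f":{pat}".encode("utf-8")).decode("ascii")
    body = json.dumps({"name": repo_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values from this article; the PAT is a dummy string.
    req = build_create_repo_request("DataScienceUnit", "MyTeam", "DSProject1", "<your PAT>")
    print(req.get_method(), req.full_url)
```

Passing the returned request to `urllib.request.urlopen` would submit it. Keeping construction separate from sending lets you inspect the URL and body first.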
Import the team template into your project repository
To populate your project repository with the contents of your team template repository:
1. From your team's project Summary page, select Repos in the left navigation.
2. Select the repository name at the top of the page, and select DSProject1 from the dropdown.
3. On the DSProject1 is empty page, select Import.
4. In the Import a Git repository dialog, select Git as the Source type, and enter the URL for your TeamTemplate repository under Clone URL. The URL is https://<server name>/<organization name>/<team name>/_git/<team template repository name>. For example: https://dev.azure.com/DataScienceUnit/MyTeam/_git/TeamTemplate.
5. Select Import. The contents of your team template repository are imported into your project repository.
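The URL patterns used in this article, for the project Summary page and for the Clone URL, can be expressed as two small helpers. This is an illustrative sketch: the function names are hypothetical, and percent-encoding of names that contain spaces or other special characters is an assumption about how such names should appear in a URL.

```python
from urllib.parse import quote


def project_summary_url(server: str, organization: str, project: str) -> str:
    """https://<server name>/<organization name>/<team name>"""
    return f"https://{server}/{quote(organization, safe='')}/{quote(project, safe='')}"


def repository_url(server: str, organization: str, project: str, repository: str) -> str:
    """https://<server name>/<organization name>/<team name>/_git/<repository name>"""
    base = project_summary_url(server, organization, project)
    return f"{base}/_git/{quote(repository, safe='')}"


if __name__ == "__main__":
    print(project_summary_url("dev.azure.com", "DataScienceUnit", "MyTeam"))
    # https://dev.azure.com/DataScienceUnit/MyTeam
    print(repository_url("dev.azure.com", "DataScienceUnit", "MyTeam", "TeamTemplate"))
    # https://dev.azure.com/DataScienceUnit/MyTeam/_git/TeamTemplate
```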
If you need to customize the contents of your project repository to meet your project's specific needs, you can add, delete, or modify repository files and folders. You can work directly in Azure Repos, or clone the repository to your local machine or DSVM, make changes, and commit and push your updates to the shared project repository. Follow the instructions at Customize the contents of the team repositories.
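The clone, change, commit, and push cycle described above can be sketched as a sequence of Git commands. The helper below is hypothetical and only assembles the command lines so that you can review them first. The local directory name and commit message are placeholders, and `git push origin HEAD` is used so that the sketch does not assume a particular default branch name.

```python
import shlex
from typing import List


def customization_commands(clone_url: str, local_dir: str, message: str) -> List[List[str]]:
    """Return the Git commands for one clone, edit, commit, and push cycle."""
    return [
        # 1. Clone the shared project repository to the local machine or DSVM.
        ["git", "clone", clone_url, local_dir],
        # (Add, delete, or modify files in local_dir before the next steps.)
        # 2. Stage every change, including deletions.
        ["git", "-C", local_dir, "add", "--all"],
        # 3. Record the changes locally.
        ["git", "-C", local_dir, "commit", "-m", message],
        # 4. Push the current branch to the branch of the same name on the remote.
        ["git", "-C", local_dir, "push", "origin", "HEAD"],
    ]


if __name__ == "__main__":
    url = "https://dev.azure.com/DataScienceUnit/MyTeam/_git/DSProject1"
    for cmd in customization_commands(url, "DSProject1", "Customize project template"):
        print(shlex.join(cmd))
```

Each inner list can be passed to `subprocess.run(cmd, check=True)` to execute that step. Printing the commands first is a deliberate choice, because a push changes the shared repository.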
This article is maintained by Microsoft. It was originally written by the following contributors.
- Mark Tabladillo | Senior Cloud Solution Architect
Here are links to detailed descriptions of the other roles and tasks defined by the Team Data Science Process: