Phase 2 - Source environment assessment

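A pre-migration assessment is critical. This phase gives you a complete picture of your Azure DevOps environment, including repository history, pipeline types, and Azure Boards customizations.

Why the source assessment matters

Establishing a shared set of facts across engineering, security, and leadership prevents scope drift and surprises. An accurate inventory lets you speak confidently about quantity, size, and complexity. You will know where the complex repositories and pipelines are and why they need special attention.

Install and configure the ADO2GH extension

The ADO2GH extension for the GitHub CLI provides commands designed specifically for migrating from Azure DevOps to GitHub.

First, install the extension: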

gh extension install github/gh-ado2gh

Authenticate to GitHub, then export your Azure DevOps personal access token:

gh auth login --hostname github.com
export ADO_PAT="your-ado-pat-token"

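Generate a comprehensive inventory report

The inventory-report command creates CSV files that contain details about your Azure DevOps environment.

Generate an inventory for a specific Azure DevOps organization: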

gh ado2gh inventory-report --ado-org "your-ado-org"

For a faster assessment that collects only basic information:

gh ado2gh inventory-report --ado-org "your-ado-org" --minimal

To assess all accessible organizations, omit the --ado-org parameter:

gh ado2gh inventory-report

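This command generates several CSV files:

  • organizations.csv - Azure DevOps organizations and their settings
  • team_projects.csv - team projects within each organization
  • repositories.csv - repository details, including size, type, and activity
  • pipelines.csv - build and release pipeline configurations

Analyze repository characteristics

Each repository has unique characteristics that affect migration complexity and approach.

Inventory analysis

Focus on these key areas:

  • Total count and size distribution - categorize repositories into bands such as <100MB, 100MB–1GB, and >1GB
  • Repository type - Git repositories versus Team Foundation Version Control (TFVC)
  • Recent activity - distinguish actively developed repositories from archived ones

Use command-line tools to analyze the CSV data quickly: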

# Count repositories per team project (column 2), skipping the header row
tail -n +2 repositories.csv | cut -d',' -f2 | sort | uniq -c | sort -nr

# Get total pipeline count, excluding the header row
tail -n +2 pipelines.csv | wc -l
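
The size bands listed above can also be computed from the inventory rather than estimated by eye. The following is a minimal Python sketch, not part of the ADO2GH extension. It assumes the repository CSV has a header row with a `compressed-repo-size-in-bytes` column (the field name used in the analysis prompt later in this unit), and it strips thousands separators defensively in case sizes are written that way. The function names `size_band` and `band_counts` are hypothetical.

```python
import csv
import io
from collections import Counter

MB = 1024 ** 2
GB = 1024 ** 3


def size_band(size_bytes: int) -> str:
    """Map a repository size in bytes to one of three size bands."""
    if size_bytes < 100 * MB:
        return "<100MB"
    if size_bytes <= GB:
        return "100MB-1GB"
    return ">1GB"


def band_counts(csv_file) -> Counter:
    """Count repositories per size band from an open inventory CSV file."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Assumption: sizes may contain thousands separators such as "1,048,576".
        raw = row["compressed-repo-size-in-bytes"].replace(",", "")
        counts[size_band(int(raw))] += 1
    return counts


# Illustrative data only, not real inventory output.
sample = io.StringIO(
    "org,teamproject,repo,compressed-repo-size-in-bytes\n"
    "contoso,web,frontend,5242880\n"
    "contoso,web,assets,524288000\n"
    "contoso,data,warehouse,3221225472\n"
)
for band, count in band_counts(sample).items():
    print(band, count)
```

To run it against a real report, pass an opened file instead of the sample, for example `band_counts(open("repositories.csv", newline=""))`, using whichever file name your inventory run produced.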

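Content complexity indicators

Look for these factors that increase migration complexity:

  • Large repositories (>1GB) with extensive tag history
  • Git submodules that require a coordinated migration order
  • Binary artifacts in repository history that require conversion to Git LFS
  • Multiple active branches with protection rules
  • Extensive release tagging strategies

Risk interpretation

Understanding these patterns helps set realistic expectations:

  • Repositories over 1GB with many tags typically need longer migration times and possibly cleanup
  • Submodules need careful coordination to keep their pinned references correct
  • Binary artifacts in history may require a Git LFS migration or cleanup work

Map Azure Boards dependencies

Azure Boards analysis helps you understand work-tracking dependencies and choose the right migration strategy.

Document these key aspects: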

### Azure Boards Usage Analysis
Active Projects: [List projects actively using Boards]
Custom Work Item Types: [Document custom types beyond standard]
Process Templates: [Note Agile, Scrum, CMMI, or custom processes]
Custom Fields: [List organization-specific fields]
Workflow Customizations: [Document state transitions and rules]
Integration Points: [Repositories with AB# work item links]
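
To fill in the "Integration Points" entry, one option is to scan commit messages for `AB#<id>` mentions. The following is a minimal Python sketch under stated assumptions: the function name `work_item_ids` is hypothetical, the messages are illustrative, and matching is case-sensitive on the literal `AB#` prefix.

```python
import re

# Matches mentions such as "AB#1042" and captures the numeric work item ID.
AB_LINK = re.compile(r"\bAB#(\d+)\b")


def work_item_ids(messages):
    """Return the set of work item IDs referenced as AB#<id> in the messages."""
    ids = set()
    for message in messages:
        ids.update(int(match) for match in AB_LINK.findall(message))
    return ids


# Illustrative commit messages only.
sample_messages = [
    "Fix login timeout AB#1042",
    "Refactor build script",
    "Resolve AB#1042 and AB#977 regressions",
]
print(sorted(work_item_ids(sample_messages)))
```

In practice, the messages could come from the output of `git log --format=%B` in each cloned repository. A repository with no matches is a candidate for migration without work item link handling.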

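Strategic decision framework

Choose your Azure Boards approach:

  • Keep - when time is tight and the work-tracking process is heavily customized
  • Migrate - when work items can be exported and imported into GitHub Issues or Projects
  • Hybrid - when you need work item links but want a gradual transition

GitHub Copilot-assisted migration analysis

To simplify and standardize the migration assessment, teams now use a Copilot-based migration analysis prompt that automates repository scoring, wave classification, and timeline generation. The prompt implements the same logic and structure as the GitHub Enterprise Importer migration strategy, including:

  • Repository scoring (size, activity, simplicity)
  • Migration wave assignment (waves 1-3 and manual review)
  • Timeline projection, plus risk and eligibility analysis
  • Summary report generated in Markdown

The complete prompt used for this process is included below for reference.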

# GitHub Enterprise Migration Analysis Prompt

## System Instructions

You are an expert migration analyst specializing in GitHub Enterprise migrations. Your task is to analyze repository data from a SQLite database or CSV file and generate a comprehensive migration strategy document similar to a GitHub Enterprise Importer Migration Strategy.

## Input Data Requirements

The input data should contain repository information with the following key fields:
- Repository identification (org, teamproject, repo, url)
- Size metrics (compressed-repo-size-in-bytes)
- Activity metrics (last-push-date, commits-past-year, pr-count)
- Development metrics (pipeline-count, most-active-contributor)
- Note: This is Azure DevOps migration data, so some GitHub-specific fields are not available

## Analysis Framework

### 1. Repository Scoring System
Implement a Migration Compatibility Score (0-11 points) based on:

Size Score (0-5 Points) (based on compressed-repo-size-in-bytes):
- 5 points: < 10MB (Excellent - Fast migration)
- 4 points: < 100MB (Very Good - Standard migration)  
- 3 points: < 500MB (Good - Moderate migration time)
- 2 points: < 5GB (Fair - Longer migration)
- 1 point: < 30GB (Poor - Extended migration, near project limit)
- 0 points: ≥ 30GB (Risk - Exceeds project limit, requires special handling)

Activity Score (0-3 Points) (based on commits-past-year):
- 3 points: 0 commits (Inactive - Lowest migration disruption)
- 2 points: 1-10 commits (Low activity - Minimal disruption)
- 1 point: 11-100 commits (Moderate activity - Some disruption risk)
- 0 points: > 100 commits (High activity - Significant disruption risk)

Simplicity Score (0-3 Points) (based on pr-count and pipeline-count):
- 3 points: 0 PRs AND ≤ 1 pipeline (Simple)
- 2 points: ≤ 10 PRs AND ≤ 3 pipelines (Moderate)
- 1 point: ≤ 50 PRs AND ≤ 10 pipelines (Complex)
- 0 points: > 50 PRs OR > 10 pipelines (Very Complex)

### 2. Migration Wave Classification

Eligibility Criteria (must meet ALL to be eligible for automated waves):
- compressed-repo-size-in-bytes < 30GB
- pipeline-count ≤ 10 (manageable CI/CD complexity)
- Recent activity (last-push-date within last 1 year)
- Note: Azure DevOps data doesn't include fork/archived status - assume all are eligible unless size/complexity exceeds limits

Wave Assignment:
- Wave 1 (Score 9-11): Low Risk - Small, simple, low-activity repositories
- Wave 2 (Score 6-8): Medium Risk - Standard repositories with moderate complexity
- Wave 3 (Score 3-5): High Risk - Complex or large repositories requiring planning
- Manual Review (Score 0-2 OR fails eligibility): Very High Risk - Individual assessment required

### 3. Timeline Calculation

Migration Rates by Wave:
- Wave 1: 800 repositories/day (100 repos/hour) - 8-hour workday
- Wave 2: 600 repositories/day (75 repos/hour) - 8-hour workday
- Wave 3: 400 repositories/day (50 repos/hour) - 8-hour workday
- Manual Review: 200 repositories/day (manual processing)

Calculate timeline in business days (22 days/month)

## Required Output Document Structure

Generate a comprehensive markdown document with the following sections:

### 1. Executive Summary
- Total repositories analyzed
- Automated migration eligible count and percentage
- Manual review required count and percentage
- Estimated timeline in months
- Peak daily throughput
- Migration scale overview with wave distribution

### 2. Repository Selection Criteria
- Repository-level requirements table
- GitHub Enterprise Importer specific considerations
- Activity-based prioritization strategy explanation
- Benefits of low-activity first approach

### 3. Repository Scoring System
- Detailed scoring methodology
- Risk assessment matrix table
- Score range to risk level mapping

### 4. Migration Wave Strategy
#### For each wave (Wave 1, 2, 3, Manual Review):
- Repository count
- Timeline calculation
- Batch size and migration rate
- Characteristics description
- Success criteria
- Special considerations (especially for 30GB limits in Wave 3)

#### Total Migration Timeline Summary:
- Wave breakdown table with counts, percentages, timelines
- Key insights about distribution
- Impact analysis of project constraints

### 5. Repository Size Analysis
#### Top 10 Largest Repositories Table:
- Repository name (org/teamproject/repo)
- Size in GB (converted from compressed-repo-size-in-bytes)
- Last push date
- Commits past year
- PR count
- Pipeline count
- Most active contributor
- Migration impact assessment

#### Critical Size Observations:
- Repositories exceeding 30GB limit
- Near-limit repositories (25-30GB)
- Size optimization recommendations

### 6. GitHub Enterprise Importer Limitations and Considerations
#### Capabilities Table:
- What data IS migrated (based on GHES version)
- Technical limitations and size limits
- Data that is NOT migrated (comprehensive list)
- Branch protection limitations
- Migration process limitations

## Data Analysis Instructions

1. Calculate Migration Scores: Apply the scoring system to all repositories
2. Determine Eligibility: Check each repository against eligibility criteria
3. Assign Migration Waves: Based on eligibility and scores
4. Generate Statistics: Calculate counts, percentages, and timelines for each wave
5. Identify Outliers: Find largest repositories, most complex repositories, edge cases
6. Risk Assessment: Identify repositories requiring special attention
7. Timeline Projection: Calculate realistic migration timeline based on complexity

## Output Requirements

- Generate professional markdown document
- Include all statistical analysis with specific numbers
- Provide actionable recommendations
- Include tables with proper formatting
- Calculate realistic timelines
- Identify risks and mitigation strategies
- Reference GitHub Enterprise Importer documentation limitations
- Use consistent terminology throughout

## Data Source

Analyze the provided SQLite database or CSV file containing repository data with the schema and views described above. Use the `migration_scoring` view for wave assignments and the `manual_review_repositories` view for detailed manual review analysis.

Available fields in repos.csv:
- org: Organization name
- teamproject: Team project name
- repo: Repository name
- url: Azure DevOps repository URL
- last-push-date: Last activity date
- pipeline-count: Number of CI/CD pipelines
- compressed-repo-size-in-bytes: Repository size in bytes
- most-active-contributor: Primary contributor
- pr-count: Number of pull requests
- commits-past-year: Commit activity in past year

Key analysis patterns for CSV data:
- Size conversion: compressed-repo-size-in-bytes / (1024^3) for GB
- Activity assessment: commits-past-year and last-push-date
- Complexity assessment: pipeline-count and pr-count
- Repository identification: org/teamproject/repo combination

Generate the complete migration strategy document based on this analysis framework and the provided repository data.
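
The scoring, wave, and timeline rules in the prompt above can be checked locally before you rely on generated output. The following is a minimal Python sketch of those rules; it is not part of the prompt or of any GitHub tool. The `Repo` class and all function names are hypothetical. It also assumes that "within last 1 year" means 365 days, that waves run one after another, and that each wave's duration rounds up to whole business days.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from math import ceil

MB, GB = 1024 ** 2, 1024 ** 3


@dataclass
class Repo:
    size_bytes: int
    commits_past_year: int
    pr_count: int
    pipeline_count: int
    last_push: date


def size_score(size_bytes):
    """Size Score (0-5): smaller repositories score higher."""
    thresholds = ((5, 10 * MB), (4, 100 * MB), (3, 500 * MB), (2, 5 * GB), (1, 30 * GB))
    for points, limit in thresholds:
        if size_bytes < limit:
            return points
    return 0


def activity_score(commits):
    """Activity Score (0-3): less active repositories score higher."""
    if commits == 0:
        return 3
    if commits <= 10:
        return 2
    if commits <= 100:
        return 1
    return 0


def simplicity_score(prs, pipelines):
    """Simplicity Score (0-3): fewer PRs and pipelines score higher."""
    if prs == 0 and pipelines <= 1:
        return 3
    if prs <= 10 and pipelines <= 3:
        return 2
    if prs <= 50 and pipelines <= 10:
        return 1
    return 0


def is_eligible(repo, today):
    """Eligibility for automated waves; all three criteria must hold."""
    return (
        repo.size_bytes < 30 * GB
        and repo.pipeline_count <= 10
        and repo.last_push >= today - timedelta(days=365)  # assumed meaning of "1 year"
    )


def assign_wave(repo, today):
    """Combine the three scores (0-11) and map the total to a wave."""
    score = (
        size_score(repo.size_bytes)
        + activity_score(repo.commits_past_year)
        + simplicity_score(repo.pr_count, repo.pipeline_count)
    )
    if not is_eligible(repo, today) or score <= 2:
        return "Manual Review"
    if score >= 9:
        return "Wave 1"
    if score >= 6:
        return "Wave 2"
    return "Wave 3"


RATES_PER_DAY = {"Wave 1": 800, "Wave 2": 600, "Wave 3": 400, "Manual Review": 200}


def business_days(wave_counts):
    """Total business days, assuming sequential waves rounded up per wave."""
    return sum(ceil(count / RATES_PER_DAY[wave]) for wave, count in wave_counts.items())


# Illustrative repository only.
today = date(2025, 1, 15)
small = Repo(5 * MB, 4, 0, 1, today - timedelta(days=30))
print(assign_wave(small, today))
print(business_days({"Wave 1": 1200, "Wave 2": 600, "Wave 3": 50, "Manual Review": 10}))
```

Dividing the result of `business_days` by 22 gives the estimate in months, matching the prompt's business-day convention.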

Map Azure Pipelines dependencies

Azure Pipelines analysis reveals the complexity of your build and deployment automation.

Catalog these pipeline elements:

### Azure Pipelines Usage Analysis
Classic Build Pipelines: [Count and complexity]
Classic Release Pipelines: [Count and deployment stages]
YAML Pipelines: [Count and modern practices adoption]
Service Connections: [External service integrations]
Variable Groups: [Shared configuration and secrets]
Secure Files: [Certificates and configuration files]
Agent Pools: [Self-hosted vs. Microsoft-hosted usage]

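Pipeline migration strategy

  • Keep - complex classic workflows that would take significant time to rewrite
  • Migrate - YAML-based pipelines that can be adapted to GitHub Actions
  • Hybrid - separate code hosting from release orchestration during the transition

Understanding these dependencies helps you communicate realistic timelines and resource needs to stakeholders.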