tcga-downloader/CONTRIBUTING.md
yunpeng.zhang a01a59b371
Some checks failed
CI / Lint (push) Failing after 9m32s
CI / Test (3.11) (push) Successful in 6m41s
CI / Test (3.12) (push) Successful in 4m21s
feat: add interactive cli
2026-02-09 13:13:39 +08:00

4.3 KiB

Contributing to TCGA Downloader

Thank you for your interest in contributing! This document provides guidelines and instructions for contributing to the project.

Getting Started

Prerequisites

  • Python 3.11 or higher
  • pip or uv
  • gdc-client (for testing downloads)
  • Git

Setting Up Development Environment

  1. Fork the repository
  2. Clone your fork:
git clone <your-fork-url>
cd tcga-downloader
  1. Install the package in development mode:
pip install -e ".[dev]"
  1. Install and setup pre-commit hooks:
pip install pre-commit
pre-commit install

Development Workflow

Branch Strategy

  • main: Production-ready code
  • Create feature branches from main: feature/your-feature-name
  • Use descriptive branch names

Making Changes

  1. Create a new branch for your feature:
git checkout -b feature/your-feature-name
  1. Make your changes following the coding standards (see below)

  2. Run tests and linting:

pytest
black .
ruff check --fix .
mypy tcga_downloader
  1. Commit your changes with descriptive messages:
git add .
git commit -m "feat: add new filtering option for sample type"
  1. Push to your fork:
git push origin feature/your-feature-name
  1. Create a Pull Request

Coding Standards

Code Style

This project uses:

  • Black: Code formatting (line length: 100)
  • Ruff: Linting
  • mypy: Type checking

Before committing, run:

black .
ruff check --fix .
mypy tcga_downloader

Or use pre-commit hooks which run automatically:

pre-commit run --all-files

Type Hints

All public functions should include type hints. Use standard library types:

from typing import List, Optional

def query_files(
    project: str,
    data_type: str,
    fields: Optional[List[str]] = None,
) -> List[dict]:
    ...

Logging

Use the logging module instead of print statements:

from tcga_downloader.logger import get_logger

logger = get_logger("module_name")
logger.info("Processing file: %s", filename)
logger.debug("Detailed debug information")

Error Handling

Use descriptive error messages and appropriate exception types:

if not manifest_path.exists():
    raise FileNotFoundError(f"Manifest file not found: {manifest_path}")

Testing

Running Tests

pytest

With coverage:

pytest --cov=tcga_downloader --cov-report=html --cov-report=term

Writing Tests

  • Place tests in the tests/ directory
  • Name test files: test_<module>.py
  • Name test functions: test_<function>_<scenario>

Example:

def test_manifest_roundtrip_tsv(tmp_path):
    records = [ManifestRecord(...)]
    path = tmp_path / "m.tsv"
    write_manifest(records, path, fmt="tsv")
    loaded = load_manifest(path)
    assert loaded == records

Test Coverage

Aim for high test coverage. New features should include tests.

Documentation

Code Documentation

Use docstrings for public APIs:

def build_filters(project: str, data_type: str) -> dict:
    """Build GDC API filters for querying files.

    Args:
        project: TCGA project ID (e.g., "TCGA-BRCA")
        data_type: Data type to filter (e.g., "Gene Expression")

    Returns:
        Filter dictionary for GDC API query
    """
    ...

Documentation Updates

When adding new features:

  1. Update README.md with usage examples
  2. Add type hints
  3. Update relevant sections in documentation

Pull Request Process

PR Checklist

Before submitting a PR, ensure:

  • Code follows project style guidelines
  • All tests pass
  • New features include tests
  • Documentation is updated
  • Commit messages are clear and descriptive
  • Pre-commit hooks pass

PR Template

## Description
Brief description of changes

## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update

## Testing
Describe tests added/updated

## Checklist
- [ ] Tests pass
- [ ] Code style checks pass
- [ ] Documentation updated

Code Review

  • Be constructive and respectful
  • Focus on code quality and correctness
  • Address review comments promptly
  • Request changes when necessary

Questions or Issues?

Feel free to open an issue or ask in discussions for any questions!