244 lines
4.3 KiB
Markdown
244 lines
4.3 KiB
Markdown
# Contributing to TCGA Downloader
|
|
|
|
Thank you for your interest in contributing! This document provides guidelines and instructions for contributing to the project.
|
|
|
|
## Getting Started
|
|
|
|
### Prerequisites
|
|
|
|
- Python 3.11 or higher
|
|
- pip or uv
|
|
- gdc-client (for testing downloads)
|
|
- Git
|
|
|
|
### Setting Up Development Environment
|
|
|
|
1. Fork the repository
|
|
2. Clone your fork:
|
|
|
|
```bash
|
|
git clone <your-fork-url>
|
|
cd tcga-downloader
|
|
```
|
|
|
|
3. Install the package in development mode:
|
|
|
|
```bash
|
|
pip install -e ".[dev]"
|
|
```
|
|
|
|
4. Install and setup pre-commit hooks:
|
|
|
|
```bash
|
|
pip install pre-commit
|
|
pre-commit install
|
|
```
|
|
|
|
## Development Workflow
|
|
|
|
### Branch Strategy
|
|
|
|
- `main`: Production-ready code
|
|
- Create feature branches from `main`: `feature/your-feature-name`
|
|
- Use descriptive branch names
|
|
|
|
### Making Changes
|
|
|
|
1. Create a new branch for your feature:
|
|
|
|
```bash
|
|
git checkout -b feature/your-feature-name
|
|
```
|
|
|
|
2. Make your changes following the coding standards (see below)
|
|
|
|
3. Run tests and linting:
|
|
|
|
```bash
|
|
pytest
|
|
black .
|
|
ruff check --fix .
|
|
mypy tcga_downloader
|
|
```
|
|
|
|
4. Commit your changes with descriptive messages:
|
|
|
|
```bash
|
|
git add .
|
|
git commit -m "feat: add new filtering option for sample type"
|
|
```
|
|
|
|
5. Push to your fork:
|
|
|
|
```bash
|
|
git push origin feature/your-feature-name
|
|
```
|
|
|
|
6. Create a Pull Request
|
|
|
|
## Coding Standards
|
|
|
|
### Code Style
|
|
|
|
This project uses:
|
|
- **Black**: Code formatting (line length: 100)
|
|
- **Ruff**: Linting
|
|
- **mypy**: Type checking
|
|
|
|
Before committing, run:
|
|
|
|
```bash
|
|
black .
|
|
ruff check --fix .
|
|
mypy tcga_downloader
|
|
```
|
|
|
|
Or use pre-commit hooks which run automatically:
|
|
|
|
```bash
|
|
pre-commit run --all-files
|
|
```
|
|
|
|
### Type Hints
|
|
|
|
All public functions should include type hints. Use standard library types:
|
|
|
|
```python
|
|
from typing import List, Optional
|
|
|
|
def query_files(
|
|
project: str,
|
|
data_type: str,
|
|
fields: Optional[List[str]] = None,
|
|
) -> List[dict]:
|
|
...
|
|
```
|
|
|
|
### Logging
|
|
|
|
Use the logging module instead of print statements:
|
|
|
|
```python
|
|
from tcga_downloader.logger import get_logger
|
|
|
|
logger = get_logger("module_name")
|
|
logger.info("Processing file: %s", filename)
|
|
logger.debug("Detailed debug information")
|
|
```
|
|
|
|
### Error Handling
|
|
|
|
Use descriptive error messages and appropriate exception types:
|
|
|
|
```python
|
|
if not manifest_path.exists():
|
|
raise FileNotFoundError(f"Manifest file not found: {manifest_path}")
|
|
```
|
|
|
|
## Testing
|
|
|
|
### Running Tests
|
|
|
|
```bash
|
|
pytest
|
|
```
|
|
|
|
With coverage:
|
|
|
|
```bash
|
|
pytest --cov=tcga_downloader --cov-report=html --cov-report=term
|
|
```
|
|
|
|
### Writing Tests
|
|
|
|
- Place tests in the `tests/` directory
|
|
- Name test files: `test_<module>.py`
|
|
- Name test functions: `test_<function>_<scenario>`
|
|
|
|
Example:
|
|
|
|
```python
|
|
def test_manifest_roundtrip_tsv(tmp_path):
|
|
records = [ManifestRecord(...)]
|
|
path = tmp_path / "m.tsv"
|
|
write_manifest(records, path, fmt="tsv")
|
|
loaded = load_manifest(path)
|
|
assert loaded == records
|
|
```
|
|
|
|
### Test Coverage
|
|
|
|
Aim for high test coverage. New features should include tests.
|
|
|
|
## Documentation
|
|
|
|
### Code Documentation
|
|
|
|
Use docstrings for public APIs:
|
|
|
|
```python
|
|
def build_filters(project: str, data_type: str) -> dict:
|
|
"""Build GDC API filters for querying files.
|
|
|
|
Args:
|
|
project: TCGA project ID (e.g., "TCGA-BRCA")
|
|
data_type: Data type to filter (e.g., "Gene Expression")
|
|
|
|
Returns:
|
|
Filter dictionary for GDC API query
|
|
"""
|
|
...
|
|
```
|
|
|
|
### Documentation Updates
|
|
|
|
When adding new features:
|
|
1. Update README.md with usage examples
|
|
2. Add type hints
|
|
3. Update relevant sections in documentation
|
|
|
|
## Pull Request Process
|
|
|
|
### PR Checklist
|
|
|
|
Before submitting a PR, ensure:
|
|
|
|
- [ ] Code follows project style guidelines
|
|
- [ ] All tests pass
|
|
- [ ] New features include tests
|
|
- [ ] Documentation is updated
|
|
- [ ] Commit messages are clear and descriptive
|
|
- [ ] Pre-commit hooks pass
|
|
|
|
### PR Template
|
|
|
|
```markdown
|
|
## Description
|
|
Brief description of changes
|
|
|
|
## Type of Change
|
|
- [ ] Bug fix
|
|
- [ ] New feature
|
|
- [ ] Breaking change
|
|
- [ ] Documentation update
|
|
|
|
## Testing
|
|
Describe tests added/updated
|
|
|
|
## Checklist
|
|
- [ ] Tests pass
|
|
- [ ] Code style checks pass
|
|
- [ ] Documentation updated
|
|
```
|
|
|
|
## Code Review
|
|
|
|
- Be constructive and respectful
|
|
- Focus on code quality and correctness
|
|
- Address review comments promptly
|
|
- Request changes when necessary
|
|
|
|
## Questions or Issues?
|
|
|
|
Feel free to open an issue or ask in discussions for any questions!
|