Add As1 support #1257
Merged
Changes from 19 commits
Commits
23 commits
31a0c31
init As1 class file hooray
tristanpwdennis d7fa6df
add docs
tristanpwdennis 76c5d89
Update base class, init, start adding test coverage
tristanpwdennis e529a78
Merge branch 'malariagen:master' into add-As1
tristanpwdennis b57322f
merge
tristanpwdennis 569105e
test genome features
tristanpwdennis 7851553
accidentally mangled conftest, thanks claude code, working now
tristanpwdennis 38fe863
Merge branch 'master' into add-As1
tristanpwdennis ceaf546
fix phasing analysis misspec
tristanpwdennis 774e93f
x
tristanpwdennis d761029
add index entry and grid image
tristanpwdennis c9e8202
ci: trigger test run
tristanpwdennis a5ffbdf
Merge branch 'master' into add-As1
tristanpwdennis 9d1cba2
add tests for as1. add cnv flag to skip tests for classes without cn…
tristanpwdennis 46ecca7
Merge branch 'add-As1' of https://github.com/tristanpwdennis/malariag…
tristanpwdennis fed7e9f
tidy metadata in fixture
tristanpwdennis 156b5fe
remove ghostly surv flags from abdi
tristanpwdennis e4d94c9
oops readding curation now
tristanpwdennis e3cd4dd
re-add dummy qc cols to test data
tristanpwdennis 7535986
fix readme
tristanpwdennis 97017b4
fix more typosd
tristanpwdennis a09b28d
Merge branch 'master' into add-As1
jonbrenas a146798
Merge branch 'master' into add-As1
ahernank
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,89 +1,15 @@ | ||
| # `malariagen_data` - analyse MalariaGEN data from Python | ||
| # Curation metadata | ||
|
|
||
| This Python package provides methods for accessing and analysing data from MalariaGEN. | ||
| Summary statistics used during our sequence QC process are available within each sample set subdirectory, in a file named "sequence_qc_stats.csv". Each file contains the following fields: | ||
|
|
||
| ## Installation | ||
| - `sample_id` (string) - MalariaGEN sample identifier | ||
| - `mean_cov` (float) - mean coverage | ||
| - `median_cov` (int) - median coverage | ||
| - `modal_cov` (int) - modal coverage | ||
| - `mean_cov_{contig}` (float) - mean coverage for a particular contig | ||
| - `median_cov_{contig}` (int) - median coverage for a particular contig | ||
| - `mode_cov_{contig}` (int) - modal coverage for a particular contig | ||
| - `frac_gen_cov` (float) - fraction of the genome covered | ||
| - `divergence` (float) - divergence | ||
|
|
||
| The `malariagen_data` Python package is available from the Python | ||
| package index (PyPI) and can be installed via `pip`, e.g.: | ||
|
|
||
| ```bash | ||
| pip install malariagen-data | ||
| ``` | ||
|
|
||
| ## Documentation | ||
|
|
||
| Documentation of classes and methods in the public API are available | ||
| from the following locations: | ||
|
|
||
| - [Ag3 API | ||
| docs](https://malariagen.github.io/malariagen-data-python/latest/Ag3.html) | ||
|
|
||
| - [Af1 API | ||
| docs](https://malariagen.github.io/malariagen-data-python/latest/Af1.html) | ||
|
|
||
| - [Amin1 API | ||
| docs](https://malariagen.github.io/malariagen-data-python/latest/Amin1.html) | ||
|
|
||
| - [Adir1 API | ||
| docs](https://malariagen.github.io/malariagen-data-python/latest/Adir1.html) | ||
|
|
||
| - [Pf8 API | ||
| docs](https://malariagen.github.io/parasite-data/pf8/api.html) | ||
|
|
||
| - [Pf7 API | ||
| docs](https://malariagen.github.io/parasite-data/pf7/api.html) | ||
|
|
||
| - [Pv4 API | ||
| docs](https://malariagen.github.io/parasite-data/pv4/api.html) | ||
|
|
||
| ## Release notes (change log) | ||
|
|
||
| See [GitHub releases](https://github.com/malariagen/malariagen-data-python/releases) | ||
| for release notes. | ||
|
|
||
| ## Developer setup | ||
|
|
||
| To get setup for development, see [this video if you prefer VS Code](https://youtu.be/zddl3n1DCFM), or [this older video if you prefer PyCharm](https://youtu.be/QniQi-Hoo9A). | ||
|
|
||
| For detailed setup instructions, see: | ||
| - [Linux setup guide](LINUX_SETUP.md) | ||
| - [macOS setup guide](MACOS_SETUP.md) | ||
| - [Windows setup guide](WINDOWS_SETUP.md) | ||
| - [Google Colab (TPU) setup guide](docs/source/colab_tpu_runtime.rst) | ||
| Detailed instructions can be found in the [Contributors guide](https://github.com/malariagen/malariagen-data-python/blob/master/CONTRIBUTING.md). | ||
|
|
||
| ## AI use policy and guidelines | ||
|
|
||
| See [AI use policy and guidelines](https://github.com/malariagen/malariagen-data-python/blob/master/AI-POLICY.md) for more details. | ||
|
|
||
| ## Release process | ||
|
|
||
| Create a new GitHub release. That's it. This will automatically | ||
| trigger publishing of a new release to PyPI and a new version of | ||
| the documentation via GitHub Actions. | ||
|
|
||
| The version switcher for the documentation can then be updated by | ||
| modifying the `docs/source/_static/switcher.json` file accordingly. | ||
|
|
||
| ## Citation | ||
|
|
||
| If you use the `malariagen_data` package in a publication | ||
| or include any of its functions or code in other materials (_e.g._ training resources), | ||
| please cite: [doi.org/10.5281/zenodo.11173411](https://doi.org/10.5281/zenodo.11173411) | ||
|
|
||
| Some functions may require additional citations to acknowledge specific contributions. These are indicated in the description for each relevant function. | ||
|
|
||
| For any questions, please feel free to contact us at: [support@malariagen.net](mailto:support@malariagen.net) | ||
|
|
||
|
|
||
| ## Sponsorship | ||
|
|
||
| This project is currently supported by the following grants: | ||
|
|
||
| * [BMGF INV-068808](https://www.gatesfoundation.org/about/committed-grants/2024/04/inv-068808) | ||
| * [BMGF INV-062921](https://www.gatesfoundation.org/about/committed-grants/2024/07/inv-062921) | ||
|
|
||
| This project was previously supported by the following grants: | ||
|
|
||
| * [BMGF INV-001927](https://www.gatesfoundation.org/about/committed-grants/2019/11/inv001927) | ||
| For further information or queries contact support@malariagen.net. | ||
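The field list in the curation metadata text above could be consumed along the following lines. This is a minimal sketch using only the Python standard library: the two-row CSV excerpt, the `parse_qc_stats` helper and the coverage threshold of 20 are all made up for illustration, and it assumes the integer fields are stored without a decimal point.

```python
import csv
import io

# Hypothetical excerpt; real files are named "sequence_qc_stats.csv" and sit
# in each sample set subdirectory. The values below are invented.
EXAMPLE_CSV = """sample_id,mean_cov,median_cov,modal_cov,frac_gen_cov,divergence
S0001,31.5,31,30,0.98,0.012
S0002,12.25,12,11,0.91,0.015
"""

# Fields documented as int; per-contig variants are matched by prefix below.
INT_FIELDS = {"median_cov", "modal_cov"}


def parse_qc_stats(handle):
    """Return a list of dicts with numeric fields converted from strings."""
    rows = []
    for raw in csv.DictReader(handle):
        row = {}
        for name, value in raw.items():
            if name == "sample_id":
                row[name] = value  # string identifier, kept as-is
            elif name in INT_FIELDS or name.startswith(("median_cov_", "mode_cov_")):
                row[name] = int(value)
            else:
                row[name] = float(value)  # all remaining documented fields are floats
        rows.append(row)
    return rows


rows = parse_qc_stats(io.StringIO(EXAMPLE_CSV))

# Illustrative filter only: the threshold is arbitrary, not a project QC rule.
low_cov = [r["sample_id"] for r in rows if r["mean_cov"] < 20]
print(low_cov)
```

To read a real file, pass an open file handle to `parse_qc_stats` instead of the `io.StringIO` wrapper.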
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,141 @@ | ||
| As1 | ||
| ===== | ||
|
|
||
| This page provides a curated list of functions and properties available in the ``malariagen_data`` API | ||
| for data on *Anopheles stephensi* species mosquitoes. | ||
|
|
||
| To set up the API, use the following code:: | ||
|
|
||
| import malariagen_data | ||
| as1 = malariagen_data.As1() | ||
|
|
||
| All the functions below can then be accessed as methods on the ``as1`` object. E.g., to call the | ||
| ``sample_metadata()`` function, do:: | ||
|
|
||
| df_samples = as1.sample_metadata() | ||
|
|
||
| For more information about the data and terms of use, please see the | ||
| `MalariaGEN website <https://www.malariagen.net/data>`_ or contact support@malariagen.net. | ||
|
|
||
| .. currentmodule:: malariagen_data.as1.As1 | ||
|
|
||
| Basic data access | ||
| ----------------- | ||
| .. autosummary:: | ||
| :toctree: generated/ | ||
|
|
||
| releases | ||
| sample_sets | ||
| lookup_release | ||
| lookup_study | ||
|
|
||
| Reference genome data access | ||
| ---------------------------- | ||
| .. autosummary:: | ||
| :toctree: generated/ | ||
|
|
||
| contigs | ||
| genome_sequence | ||
| genome_features | ||
| plot_transcript | ||
| plot_genes | ||
|
|
||
| Sample metadata access | ||
| ---------------------- | ||
| .. autosummary:: | ||
| :toctree: generated/ | ||
|
|
||
| sample_metadata | ||
| add_extra_metadata | ||
| clear_extra_metadata | ||
| lookup_sample | ||
| count_samples | ||
| plot_samples_bar | ||
| plot_samples_interactive_map | ||
| plot_sample_location_mapbox | ||
| plot_sample_location_geo | ||
| wgs_data_catalog | ||
| cohorts | ||
|
|
||
| SNP data access | ||
| --------------- | ||
| .. autosummary:: | ||
| :toctree: generated/ | ||
|
|
||
| site_mask_ids | ||
| snp_calls | ||
| snp_allele_counts | ||
| plot_snps | ||
| site_annotations | ||
| is_accessible | ||
| biallelic_snp_calls | ||
| biallelic_diplotypes | ||
| biallelic_snps_to_plink | ||
|
|
||
| SNP frequency analysis | ||
| ---------------------- | ||
| .. autosummary:: | ||
| :toctree: generated/ | ||
|
|
||
| snp_allele_frequencies | ||
| snp_allele_frequencies_advanced | ||
| aa_allele_frequencies | ||
| aa_allele_frequencies_advanced | ||
| plot_frequencies_heatmap | ||
| plot_frequencies_time_series | ||
| plot_frequencies_interactive_map | ||
|
|
||
| Principal components analysis (PCA) | ||
| ----------------------------------- | ||
| .. autosummary:: | ||
| :toctree: generated/ | ||
|
|
||
| pca | ||
| plot_pca_variance | ||
| plot_pca_coords | ||
| plot_pca_coords_3d | ||
|
|
||
| Genetic distance and neighbour-joining trees (NJT) | ||
| -------------------------------------------------- | ||
| .. autosummary:: | ||
| :toctree: generated/ | ||
|
|
||
| plot_njt | ||
| njt | ||
| biallelic_diplotype_pairwise_distances | ||
|
|
||
| Heterozygosity analysis | ||
| ----------------------- | ||
| .. autosummary:: | ||
| :toctree: generated/ | ||
|
|
||
| plot_heterozygosity | ||
| roh_hmm | ||
| plot_roh | ||
|
|
||
| Diversity analysis | ||
| ------------------ | ||
| .. autosummary:: | ||
| :toctree: generated/ | ||
|
|
||
| cohort_diversity_stats | ||
| diversity_stats | ||
| plot_diversity_stats | ||
|
|
||
| Diplotype clustering | ||
| -------------------- | ||
| .. autosummary:: | ||
| :toctree: generated/ | ||
|
|
||
| plot_diplotype_clustering | ||
|
|
||
| Fst analysis | ||
| ------------ | ||
| .. autosummary:: | ||
| :toctree: generated/ | ||
|
|
||
| average_fst | ||
| pairwise_average_fst | ||
| plot_pairwise_average_fst | ||
| fst_gwss | ||
| plot_fst_gwss |
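The setup steps in the As1 docs page above can be wrapped up as a small helper. This is a sketch only: the function name `load_as1_sample_metadata` is hypothetical, and it assumes a `malariagen_data` release that includes the `As1` class from this PR is installed and that access to the underlying data is already configured.

```python
def load_as1_sample_metadata():
    """Set up the As1 API and return the sample metadata table."""
    # Imported lazily so that defining this helper does not require the package.
    import malariagen_data

    # Same two calls as shown in the docs page: construct the API object,
    # then call sample_metadata() as a method on it.
    as1 = malariagen_data.As1()
    return as1.sample_metadata()
```

Calling `df_samples = load_as1_sample_metadata()` mirrors the `df_samples = as1.sample_metadata()` line in the docs. The other functions listed above are accessed the same way, as methods on the `as1` object.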
What happened here?
No idea but it's been banished now