@@ -732,8 +732,10 @@ def clear_extra_metadata(self):
732732 - ``terms_of_use_expiry_date`` - Expiry date of terms of use for the sample.
733733 - ``terms_of_use_url`` - URL of the terms of use for the sample.
734734 - ``unrestricted_use`` - Whether the sample can be used without restrictions.
735+ - ``is_surveillance`` - Whether the sample can be used for surveillance.
735736
736- **Sequence QC metadata**:
737+ **Sequence QC metadata** (present for all sample sets; values may
738+ be missing if QC data is unavailable for a given sample set):
737739
738740 - ``mean_cov`` - Mean sequencing coverage across the genome.
739741 - ``median_cov`` - Median sequencing coverage across the genome.
@@ -747,7 +749,7 @@ def clear_extra_metadata(self):
747749 - ``mean_cov_3L`` - Mean coverage on chromosome arm 3L.
748750 - ``median_cov_3L`` - Median coverage on chromosome arm 3L.
749751 - ``mode_cov_3L`` - Modal coverage on chromosome arm 3L.
750- - ``mean_cov_3R`` - Mean coverage on chromosome arm 3R.=
752+ - ``mean_cov_3R`` - Mean coverage on chromosome arm 3R.
751753 - ``median_cov_3R`` - Median coverage on chromosome arm 3R.
752754 - ``mode_cov_3R`` - Modal coverage on chromosome arm 3R.
753755 - ``mean_cov_X`` - Mean coverage on chromosome X.
@@ -758,25 +760,24 @@ def clear_extra_metadata(self):
758760 - ``contam_pct`` - Estimated contamination percentage.
759761 - ``contam_LLR`` - Log-likelihood ratio for contamination estimate.
760762
761- **Surveillance flags**:
762-
763- - ``is_surveillance`` - Whether the sample can be used for surveillance.
764-
765- **AIM (Ancestry-Informative Marker) metadata** (if available):
763+ **AIM (Ancestry-Informative Marker) metadata** (only present when
764+ an AIM analysis is available for the data resource, e.g., *Ag3*):
766765
767- - ``aim_species_fraction_arab`` - Fraction of gambcolu vs. arabiensis AIMs
768- indicating arabiensis.
766+ - ``aim_species_fraction_arab`` - Fraction of gambcolu vs. arabiensis
767+ AIMs indicating arabiensis.
769768 - ``aim_species_fraction_colu`` - Fraction of gambiae vs. coluzzii AIMs
770769 indicating coluzzii.
771- - ``aim_species_fraction_colu_no2l`` - Fraction of gambiae vs. coluzzii AIMs
772- indicating coluzzii, excluding chromosome arm 2L.
770+ - ``aim_species_fraction_colu_no2l`` - Fraction of gambiae vs. coluzzii
771+ AIMs indicating coluzzii, excluding chromosome arm 2L.
773772 - ``aim_species_gambcolu_arabiensis`` - Taxon assigned by gambcolu vs.
774773 arabiensis AIMs.
775774 - ``aim_species_gambiae_coluzzii`` - Taxon assigned by gambiae vs.
776775 coluzzii AIMs.
777776 - ``aim_species`` - Final species assignment combining both AIM analyses.
778777
779- **Cohort metadata** (if available):
778+ **Cohort metadata** (only present when a cohorts analysis is available
779+ for the data resource; quarter columns are only present for cohorts
780+ analyses from 20230223 onwards):
780781
781782 - ``country_iso`` - ISO code of the country of collection.
782783 - ``admin1_name`` - Name of the first-level administrative region.
@@ -785,14 +786,16 @@ def clear_extra_metadata(self):
785786 - ``taxon`` - Taxon assigned by combining AIM and cohort analyses.
786787 - ``cohort_admin1_year`` - Cohort grouping by admin level 1 and year.
787788 - ``cohort_admin1_month`` - Cohort grouping by admin level 1 and month.
788- - ``cohort_admin1_quarter`` - Cohort grouping by admin level 1 and quarter.
789+ - ``cohort_admin1_quarter`` - Cohort grouping by admin level 1 and
790+ quarter (cohorts analysis >= 20230223 only).
789791 - ``cohort_admin2_year`` - Cohort grouping by admin level 2 and year.
790792 - ``cohort_admin2_month`` - Cohort grouping by admin level 2 and month.
791- - ``cohort_admin2_quarter`` - Cohort grouping by admin level 2 and quarter.
793+ - ``cohort_admin2_quarter`` - Cohort grouping by admin level 2 and
794+ quarter (cohorts analysis >= 20230223 only).
792795
793- The exact columns present depend on the sample sets requested and
794- which analyses are available. The returned DataFrame is a copy and
795- can be safely modified without affecting internal caches.
796+ The exact columns present depend on the data resource and sample sets
797+ requested. The returned DataFrame is a copy and can be safely modified
798+ without affecting internal caches.
796799 """,
797800 )
798801 def sample_metadata(
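
Because the revised docstring says the AIM and cohort columns are only present for some data resources, and the quarter columns only for cohorts analyses from 20230223 onwards, calling code may want to check for a column before using it. The following is a minimal sketch of that check, not part of this change. The helper `present_columns` and the grouping `QUARTER_COLUMNS` are hypothetical names introduced for illustration, and `example_columns` is a stand-in for the `columns` attribute of the returned DataFrame.

```python
from typing import Iterable, List

# Hypothetical grouping: the two quarter columns named in the docstring,
# which are documented as present only for cohorts analyses >= 20230223.
QUARTER_COLUMNS = ["cohort_admin1_quarter", "cohort_admin2_quarter"]


def present_columns(available: Iterable[str], wanted: Iterable[str]) -> List[str]:
    """Return the wanted column names that are actually available.

    The result preserves the order of `wanted`.
    """
    available_set = set(available)
    return [name for name in wanted if name in available_set]


# Stand-in for `df.columns` from a resource without quarter or AIM columns.
# These names come from the docstring; the selection is illustrative only.
example_columns = [
    "mean_cov",
    "is_surveillance",
    "country_iso",
    "cohort_admin1_year",
    "cohort_admin1_month",
]

# Empty list here, so quarter-based grouping should be skipped.
quarter_cols = present_columns(example_columns, QUARTER_COLUMNS)

# False here, so species assignment from AIMs is not available.
has_aim_species = bool(present_columns(example_columns, ["aim_species"]))
```

With a real DataFrame, the same helper could be called as `present_columns(df.columns, QUARTER_COLUMNS)`, which avoids a `KeyError` when an older cohorts analysis is in use.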