Conversation
|
TODO: stage and release data, then write tests |
|
Hi @tristanpwdennis, the As1 integration seems to follow the existing patterns well. While going through In some of the other test files, it looks like the existing species fixtures are kept and new ones are added alongside them. Would that be the intended approach here as well? |
|
Good spot - thanks @adilraza99, I will fix these |
|
@jonbrenas @ahernank - testing has hit an issue. As discussed we didn't generate contamination QC data for the samples. The tests fail as they are expecting comtam_pc and contam_llc columns. Rather than restructure the tests, I've added dummy data to the columns in the fixture. My rationale for this is that in the future, we will be able to include contamination data in sample set QC. Let me know if there are any issues around this. |
|
Hi @tristanpwdennis, yes, adding dummy data is the correct way to go. To be fair, all the data in the fixture is dummy data. |
|
Looks like tests are passing (for now) |
Merge branch 'add-As1' of https://github.com/tristanpwdennis/malariagen-data-python into add-As1
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #1257 +/- ##
==========================================
- Coverage 88.92% 88.85% -0.07%
==========================================
Files 55 55
Lines 6257 6273 +16
==========================================
+ Hits 5564 5574 +10
- Misses 693 699 +6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Hmm. Unable to reproduce this offline. Will investigate. |
|
Hi @tristanpwdennis, It looks like the CI failure in I noticed a few other test blocks in the same file already guard this case, for example L437, L572, L653. It seems this block (around L709–L716) does not have that check yet. Maybe it would make sense to handle it the same way here? |
|
@adilraza99 - thank you! I think this issue may be random and unrelated to As1 class - probably something to do with the annotations in the dirus assembly but we should open a separate issue about this and possibly implement the checks you suggest. |
jonbrenas
left a comment
There was a problem hiding this comment.
I don't think that the fixture needs the curation or the cohorts for all the sample sets, only the 4 (which is already a lot, Ag has 3) that are actually listed.
Also only 1365-VO-DJ-ADBI-VMF00318 has surveillance flags (and the file is empty) which sounds like a missed opportunity to test that they work fine.
The rest seems correct. @tristanpwdennis, did you test every function in the rst? Because the functions are tested on simulated data, some functions that have their input from GCS directly fail even when all tests pass.
|
Thanks @jonbrenas! I've had a tidy of the the fixture data - deleting the ghost surv flags (see sep discussion). I also added a flag allowing the dip clustering tests to be performed on classes without cnv data (eg as1, adir1, amin1). Think the .rst is covered now. |
jonbrenas
left a comment
There was a problem hiding this comment.
The rest looks OK. Hopefully, the tests would catch something going horribly wrong.
| @@ -1,89 +1,15 @@ | |||
| # `malariagen_data` - analyse MalariaGEN data from Python | |||
| # Curation metadata | |||
There was a problem hiding this comment.
No idea but it's been banished now
| Access data from Google Cloud Storage (default): | ||
|
|
||
| >>> import malariagen_data | ||
| >>> adir1 = malariagen_data.As1() |
There was a problem hiding this comment.
adir1 should be as1. Same on the next line.
|
Is there anything outstanding left on this @jonbrenas and @ahernank ? |
ahernank
left a comment
There was a problem hiding this comment.
Thanks @tristanpwdennis! Happy to merge.
See #706