Add list_extra_metadata() and improve clear_extra_metadata() to prevent silent metadata accumulation

## Description
 
Currently, `AnophelesSnpData` (and related classes via `sample_metadata.py`) allows users to add custom metadata using `add_extra_metadata()`. However, the current implementation has a few UX and state-management pitfalls that can lead to silent errors, particularly in notebook environments.
 
---
 
## The Problem
 
### 1. Silent Accumulation in Notebooks
 
`add_extra_metadata()` simply appends the provided DataFrame to the `self._extra_metadata` list. If a user re-runs a Jupyter notebook cell containing this call, the same DataFrame is registered again. When `sample_metadata()` is called later, each registered DataFrame is merged in turn, so the second `.merge()` finds the column already present and pandas disambiguates the two copies with its default suffixes (e.g., `feature_x`, `feature_y`). The original column name no longer exists, which silently breaks downstream queries that reference it.
 
### 2. No Visibility
 
There is currently no `list_extra_metadata()` or `has_extra_metadata()` method, so users have no way to inspect the object's internal state and see which extra metadata is currently registered.
 
### 3. Stale Cache Risk
 
`sample_metadata()` merges the extra metadata after the initial `_cache_sample_metadata` hit, and `merge()` returns a new DataFrame, so the cached DataFrames should not normally be affected. However, `clear_extra_metadata()` only resets `self._extra_metadata = []`. This leaves a theoretical risk: if a cached DataFrame were ever mutated in place by an edge-case operation, the stale data would survive the reset.
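To illustrate why this risk is only theoretical under normal use, here is a small pandas-only check that `merge()` leaves its left operand untouched. The `cached` DataFrame is a hypothetical stand-in for an entry in `_cache_sample_metadata`, not the library's actual cache structure.

```python
import pandas as pd

# Hypothetical stand-in for a DataFrame held in the sample metadata cache.
cached = pd.DataFrame({"sample_id": ["AR0001-C", "AR0002-C"]})

extra = pd.DataFrame({
    "sample_id": ["AR0001-C"],
    "my_custom_trait": [True],
})

# merge() returns a new DataFrame; the left operand is not modified.
merged = cached.merge(extra, how="left", on="sample_id")

print(list(cached.columns))
print(list(merged.columns))
```

Only an in-place operation on the cached object itself would leave stale columns behind.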
 
---
 
## Steps to Reproduce the Risk
 
```python
import malariagen_data
import pandas as pd
 
ag3 = malariagen_data.Ag3()
 
# User creates some custom metadata
df_custom = pd.DataFrame({
    "sample_id": ["AR0001-C", "AR0002-C"],
    "my_custom_trait": [True, False]
})
 
# User accidentally re-runs this notebook cell twice
ag3.add_extra_metadata(df_custom, on="sample_id")
ag3.add_extra_metadata(df_custom, on="sample_id")
 
# No way to check what is active:
# ag3.list_extra_metadata()  <-- DOES NOT EXIST
 
# The resulting DataFrame now has 'my_custom_trait_x' and 'my_custom_trait_y'
df_samples = ag3.sample_metadata(sample_sets="AG1000G-AO")
print([c for c in df_samples.columns if "my_custom_trait" in c])
# Output: ['my_custom_trait_x', 'my_custom_trait_y']
```
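The column renaming can also be reproduced with pandas alone, without any data access. This is a minimal sketch that assumes the registered DataFrames are merged one after another with a left join on `sample_id`; `df_samples` is an illustrative stand-in for the real sample metadata table.

```python
import pandas as pd

# Illustrative stand-in for the library's sample metadata table.
df_samples = pd.DataFrame({"sample_id": ["AR0001-C", "AR0002-C", "AR0003-C"]})

df_custom = pd.DataFrame({
    "sample_id": ["AR0001-C", "AR0002-C"],
    "my_custom_trait": [True, False],
})

# Simulate the same extra metadata having been registered twice.
extra_metadata = [df_custom, df_custom]

merged = df_samples
for data in extra_metadata:
    # The second pass finds 'my_custom_trait' on both sides, so pandas
    # applies its default '_x' / '_y' suffixes to tell them apart.
    merged = merged.merge(data, how="left", on="sample_id")

trait_columns = [c for c in merged.columns if "my_custom_trait" in c]
print(trait_columns)
# Output: ['my_custom_trait_x', 'my_custom_trait_y']
```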
 
---
 
## Proposed Solution
 
### 1. Add a `list_extra_metadata()` method
 
Add a `list_extra_metadata()` method (or a `.extra_metadata_info` property) that returns the names of the columns currently active in the extra metadata list, or the shapes of the registered extra DataFrames, so users can inspect current state at any time.
 
```python
ag3.list_extra_metadata()
# e.g. returns: [{'columns': ['my_custom_trait'], 'shape': (2, 2)}]
```
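One possible shape for this method is sketched below. `ExtraMetadataStore` is a hypothetical stand-in class, not the real `malariagen_data` implementation, and the storage layout (a list of join key and DataFrame pairs) is an assumption made for illustration.

```python
import pandas as pd


class ExtraMetadataStore:
    """Hypothetical stand-in; not the real malariagen_data class."""

    def __init__(self):
        # Assumed layout: a list of (join key, DataFrame) pairs.
        self._extra_metadata = []

    def add_extra_metadata(self, data, on="sample_id"):
        self._extra_metadata.append((on, data.copy()))

    def list_extra_metadata(self):
        """Summarise each registered DataFrame without exposing it."""
        return [
            {
                # Report only the columns the user added, not the join key.
                "columns": [c for c in data.columns if c != key],
                "shape": data.shape,
            }
            for key, data in self._extra_metadata
        ]


df_custom = pd.DataFrame({
    "sample_id": ["AR0001-C", "AR0002-C"],
    "my_custom_trait": [True, False],
})

store = ExtraMetadataStore()
store.add_extra_metadata(df_custom, on="sample_id")
info = store.list_extra_metadata()
print(info)
```

Returning plain dictionaries rather than the DataFrames themselves keeps the internal state from being modified by accident.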
 
### 2. Improve `add_extra_metadata()` safety
 
Check if the columns being added already exist in the currently registered extra metadata. Either warn the user or automatically overwrite instead of blindly appending:
 
```python
# Option A: Warn on duplicate columns
import warnings

warnings.warn(
    "Column 'my_custom_trait' is already registered in extra metadata. "
    "Call clear_extra_metadata() first, or use overwrite=True.",
    UserWarning,
    stacklevel=2,
)
 
# Option B: Support an explicit overwrite flag
ag3.add_extra_metadata(df_custom, on="sample_id", overwrite=True)
```
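The two options could be combined roughly as follows. This is a sketch under stated assumptions: `ExtraMetadataStore` and `_registered_columns()` are hypothetical names, the storage layout is assumed, and skipping the duplicate after warning is one design choice among several.

```python
import warnings

import pandas as pd


class ExtraMetadataStore:
    """Hypothetical stand-in; not the real malariagen_data class."""

    def __init__(self):
        # Assumed layout: a list of (join key, DataFrame) pairs.
        self._extra_metadata = []

    def _registered_columns(self):
        columns = set()
        for key, data in self._extra_metadata:
            columns.update(c for c in data.columns if c != key)
        return columns

    def add_extra_metadata(self, data, on="sample_id", overwrite=False):
        new_columns = {c for c in data.columns if c != on}
        clashes = sorted(new_columns & self._registered_columns())
        if clashes and not overwrite:
            warnings.warn(
                f"Column(s) {clashes} already registered in extra metadata. "
                "Call clear_extra_metadata() first, or use overwrite=True.",
                UserWarning,
                stacklevel=2,
            )
            return  # design choice in this sketch: skip the duplicate
        if clashes:
            # Drop the clashing columns from earlier registrations.
            kept = []
            for key, old in self._extra_metadata:
                drop = [c for c in clashes if c in old.columns and c != key]
                old = old.drop(columns=drop)
                if len(old.columns) > 1:  # something besides the join key
                    kept.append((key, old))
            self._extra_metadata = kept
        self._extra_metadata.append((on, data.copy()))


df_custom = pd.DataFrame({
    "sample_id": ["AR0001-C", "AR0002-C"],
    "my_custom_trait": [True, False],
})

store = ExtraMetadataStore()
store.add_extra_metadata(df_custom)

# Capture the warning to show that a simulated cell re-run triggers it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    store.add_extra_metadata(df_custom)

count_after_rerun = len(store._extra_metadata)
store.add_extra_metadata(df_custom, overwrite=True)
count_after_overwrite = len(store._extra_metadata)
```

With this behaviour, re-running a notebook cell leaves exactly one registration in place whether or not `overwrite=True` is passed.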
 
### 3. Update `clear_extra_metadata()`
 
Have it optionally (or explicitly) also clear `self._cache_sample_metadata` to guarantee a pristine state when the user wants to reset their environment:
 
```python
ag3.clear_extra_metadata(clear_cache=True)
```
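A minimal sketch of the opt-in variant is shown below. `ExtraMetadataStore` is a hypothetical stand-in, and treating `_cache_sample_metadata` as a dictionary is an assumption; the real cache may be structured differently.

```python
class ExtraMetadataStore:
    """Hypothetical stand-in; not the real malariagen_data class."""

    def __init__(self):
        self._extra_metadata = []
        # Assumption: the cache behaves like a dict keyed by sample set.
        self._cache_sample_metadata = {}

    def clear_extra_metadata(self, clear_cache=False):
        self._extra_metadata = []
        if clear_cache:
            # Forces the next sample_metadata() call to rebuild from source.
            self._cache_sample_metadata = {}


store = ExtraMetadataStore()
store._extra_metadata.append(("sample_id", "placeholder for a DataFrame"))
store._cache_sample_metadata["AG1000G-AO"] = "placeholder for a cached DataFrame"

# Default behaviour: only the extra metadata is reset.
store.clear_extra_metadata()
cache_size_default = len(store._cache_sample_metadata)

# Opt-in behaviour: the cache is reset as well.
store.clear_extra_metadata(clear_cache=True)
cache_size_cleared = len(store._cache_sample_metadata)
```

Keeping `clear_cache=False` as the default would preserve current behaviour and avoid unnecessary reloads for users who only want to drop their custom columns.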
 
---
 
## Expected Behavior
 
Users should be able to:
 
- Inspect their active custom metadata via `list_extra_metadata()` or an equivalent property.
- Be warned (or blocked) when attempting to register duplicate columns.
- Fully reset API state via `clear_extra_metadata()` without risk of stale cached data persisting.
 
