The standard Playwright approach for Electron apps — point executablePath at the binary and let Playwright manage the process lifecycle.
Pros:
- First-class Playwright API —
app.evaluate()runs code in the main process, not just renderer - Automatic process lifecycle management (launch, close, cleanup)
- Access to Electron-specific APIs (e.g.,
app.evaluate(() => process.env)) - Well-documented, widely used for Electron testing
Cons:
- Does not work with Obsidian. Obsidian's executable is a launcher that loads an
.asarpackage (obsidian-1.11.7.asar) and forks a new Electron process. Playwright connects to the initial process, which exits, causingkill EPERMand connection failures. - No workaround without modifying Obsidian's startup or using a custom Electron shell
Verdict: Not viable for Obsidian.
Launch Obsidian as a subprocess with --remote-debugging-port=9222, then connect via Chrome DevTools Protocol.
Pros:
- Works with Obsidian's forked process architecture — the debug port is inherited by the child process
- Full access to renderer via
page.evaluate()— Obsidian's globalappobject is available - Keyboard/mouse interaction works normally
- Can take screenshots, traces, and use all Playwright assertions
- Process is managed explicitly — clear control over startup and teardown
Cons:
- No main process access (can't call Electron APIs directly, only renderer-side
window/app) - Must manually manage process lifecycle (spawn, pkill, port polling)
- Fixed debug port (9222) means tests can't run in parallel across multiple Obsidian instances without port management
- Port polling adds ~2-5s startup overhead
pkill -f Obsidianin setup is aggressive — kills ALL Obsidian instances, not just test ones
Verdict: Works well for PoC. Sufficient for single-worker CI/local testing.
Obsidian has no official testing framework. Some community approaches exist (e.g., obsidian-jest, hot-reload-based testing), but none are mature or maintained.
Verdict: Not a real option today.
app.plugins.plugins["@discourse-graph/obsidian"]— check plugin loadedapp.vault.getMarkdownFiles()— list filesapp.vault.read(file)— read file contentapp.vault.create(name, content)— create filesapp.workspace.openLinkText(path, "", false)— open a file in the editorapp.commands.executeCommandById(id)— could execute commands directly (alternative to command palette UI)
Commands are registered with IDs like @discourse-graph/obsidian:create-discourse-node. The command palette shows them as "Discourse Graph: Create discourse node".
The ModifyNodeModal renders React inside Obsidian's .modal-container:
- Node type:
<select>element (.modal-container select) - Content:
<input type="text">(.modal-container input[type='text']) - Confirm:
<button class="mod-cta">
Minimum config for plugin to load:
.obsidian/community-plugins.json→["@discourse-graph/obsidian"].obsidian/app.json→{"livePreview": true}(restricted mode must be off, but this is handled by Obsidian detecting the plugins dir)- Plugin files (
main.js,manifest.json,styles.css) in.obsidian/plugins/@discourse-graph/obsidian/
AI coding agents (Cursor, Claude Code) can run pnpm test:e2e after making changes to automatically verify features work end-to-end. The test suite should be comprehensive enough to catch regressions, fast enough to run frequently, and deterministic enough to trust the results.
Isolation improvements:
- Use a unique temp directory per test run (
os.tmpdir()) instead of a fixedtest-vault/path to avoid stale state - Use a random debug port to allow parallel runs
- Replace
pkill -f Obsidianwith tracking the specific child PID — parse it fromlsof -i :<port>after launch - Add a global setup/teardown in Playwright config to manage the single Obsidian instance across all tests
Reliability improvements:
- Replace
waitForTimeout()calls with proper waitFor conditions (e.g.,waitForSelector,waitForFunction) - Add retry logic for CDP connection (currently fails hard on timeout)
- Add a
test.beforeEachthat resets vault state (delete all non-config files) instead of full vault recreation
Core plugin features to test:
- Create each discourse node type (Question, Claim, Evidence, Source)
- Verify frontmatter (
nodeTypeId) is set correctly - Verify file naming conventions (e.g.,
QUE -,CLM -,EVD -,SRC -) - Open node type menu via hotkey (
Cmd+\) - Discourse context view toggle
- Settings panel opens and renders
Vault-level tests:
- Create multiple nodes and verify they appear in file explorer
- Verify node format regex matching (files follow the format pattern)
Use app.commands.executeCommandById() as the primary way to trigger commands — faster, more reliable, and avoids flaky command palette typing. Reserve command palette tests for testing the palette itself.
For agents to use the tests effectively:
-
Fast feedback loop — Tests should complete in <30s total. Current PoC is ~14s for 2 tests, which is good. Keep Obsidian running between test files using Playwright's
globalSetup/globalTeardown. -
Clear error messages — When a test fails, the agent needs to understand WHY. Add descriptive assertion messages:
expect( pluginLoaded, "Plugin should be loaded — check dist/ is built and plugin ID matches manifest.json", ).toBe(true);
-
Screenshot-on-failure for visual debugging — Already configured. Consider adding
page.screenshot()at key checkpoints even on success, so agents can visually verify state. -
Test file organization — One test file per feature area:
e2e/tests/ ├── plugin-load.spec.ts # Plugin loads, settings exist ├── node-creation.spec.ts # Create each node type ├── command-palette.spec.ts # Command palette interactions ├── discourse-context.spec.ts # Context view, relations └── settings.spec.ts # Settings panel -
CI integration — Run in GitHub Actions with a macOS runner. Obsidian would need to be pre-installed on the runner (or downloaded in a setup step). This is the biggest open question — Obsidian doesn't have a headless mode, so CI would need
xvfbor a virtual display. -
Agent-executable test commands:
pnpm test:e2e # run all tests pnpm test:e2e -- --grep "node creation" # run specific tests pnpm test:e2e:ui # interactive Playwright UI (for humans)
- Visual regression testing — Compare screenshots against baselines to catch UI regressions
- Obsidian version matrix — Test against multiple Obsidian versions (download different
.asarfiles) - Headless mode wrapper — Investigate running Obsidian with
--disable-gpu --headlessflags (may not work due to Obsidian's renderer requirements) - Test data fixtures — Pre-built vaults with specific node/relation configurations for testing complex scenarios
- Performance benchmarks — Measure plugin load time, command execution time
- CI runner setup — How to install Obsidian on GitHub Actions macOS runners? Is there a
.dmgdownload URL that's stable? Or do we cache the.appbundle? - Obsidian updates — Obsidian auto-updates the
.asar. Should tests pin a specific version? How to prevent auto-update during test runs? - Multiple vaults — Obsidian tracks known vaults globally. Test vaults may accumulate in Obsidian's vault list. Need cleanup strategy.
- Restricted mode — The PoC doesn't explicitly disable restricted mode via config. The plugin loads because the
community-plugins.jsonfile is present, but a fresh Obsidian install might prompt the user to enable community plugins. Need to investigate if there's a config flag to skip this.