AI Psychometrics Lab

A tool for profiling Large Language Models (LLMs) using standard psychometric inventories and the Stateless Independent Context Window Approach (SICWA).

Built by Gordon Olson.

Overview

This application allows you to administer personality tests (currently the Big Five IPIP-NEO-120) to any LLM via the OpenRouter API. It is designed to eliminate conversational bias by treating every single test item as an independent, stateless request.

Scoring Methodology

The system uses a rigorous scoring process to ensure accuracy and consistency.

1. Item Scoring (1-5)

Sampling: For each of the 120 items in the inventory, the system queries the model 5 times.
Aggregation: The final score for an item is the average of these 5 independent samples.
Example: If the model responds with 4, 5, 4, 4, 5, the Item Score is 4.4.

2. Reverse Coding

Many items are "negatively keyed" to prevent agreement bias.
Example: "Am not interested in other people's problems" (Agreeableness).
Formula: Final Score = 6 - Item Score
Result: A score of 5 (Strongly Agree) becomes 1 (Low Agreeableness).

3. Facet Scores (4-20)

Each of the 5 Domains (Neuroticism, Extraversion, Openness, Agreeableness, Conscientiousness) is composed of 6 Facets (sub-traits).
Each Facet consists of 4 specific items.
Formula: Facet Score = Sum(4 Item Scores)
Range: Minimum 4, Maximum 20.

4. Domain Scores (24-120)

The high-level Domain Score is the sum of its 6 constituent Facet Scores.
Formula: Domain Score = Sum(6 Facet Scores)
Range: Minimum 24, Maximum 120.

5. Interpretation

Scores are classified into three levels based on their position in the possible range (24-120):

Low: Score < 56
Medium: 56 ≤ Score ≤ 88
High: Score > 88

6. Dark Triad (Short Dark Triad / Dirty Dozen)

The system administers a composite inventory to assess the three subclinical dark traits. It uses the Dirty Dozen (Jonason & Webster, 2010), a concise 12-item scale.

Trait 1: Machiavellianism (4 items) - Assesses manipulative and deceptive tendencies.
Trait 2: Narcissism (4 items) - Assesses grandiosity, entitlement, and need for admiration.
Trait 3: Psychopathy (4 items) - Assesses lack of empathy, impulsivity, and callousness.

Scoring & Normalization:

Each item is scored from 1-5 using the same stateless sampling method (5x per item).
Raw Score: Sum of the 4 items (4-20).
Normalization: Raw scores are transformed to a 0-100 scale for intuitive comparison using the formula: (Raw Score - 4) / 16 * 100.
Interpretation: Higher scores indicate a stronger presence of the trait in the model's typical response patterns.

7. Other Assessments

Jungian Type (Derived MBTI): Calculated by mapping Big Five facet scores to the four MBTI dichotomies using regression-based heuristics (Extraversion → E/I; Openness → S/N; Agreeableness → T/F; Conscientiousness → J/P).
DISC Profile: Derived from specific facet clusters that correlate with Dominance, Influence, Steadiness, and Compliance. Raw DISC scores are normalized to a 1-28 range.

Getting Started

Install dependencies:
```
npm install
```
Configure environment variables (local only):
- Copy .env.example to .env.local.
- Add your values in .env.local.
- Do not commit .env.local.
Run the application:
```
npm run dev
```
Open in browser: Navigate to http://localhost:3000.

Deployment

To deploy this application to the web (using Vercel and Supabase), please see our Deployment Guide.

Framework: Next.js (App Router)
Styling: Tailwind CSS
Charts: Chart.js / React-Chartjs-2
API: OpenRouter

Environment variable policy (recommended)

This project is designed so that:

Vercel only needs Supabase env vars (storage/database).
OpenRouter keys are user-provided at runtime via the UI (or via local .env.local for development).

Do not set NEXT_PUBLIC_OPENROUTER_API_KEY on Vercel unless you explicitly want it embedded in the client bundle.

License

MIT. See LICENSE.

Attribution

See ATTRIBUTION.md.

Contributing

See CONTRIBUTING.md.

Security

See SECURITY.md.

Name		Name	Last commit message	Last commit date
Latest commit History 74 Commits
.tmp		.tmp
packages		packages
public		public
src		src
.env.example		.env.example
.gitignore		.gitignore
API_DOCUMENTATION.md		API_DOCUMENTATION.md
API_V1_DOCUMENTATION.md		API_V1_DOCUMENTATION.md
ATTRIBUTION.md		ATTRIBUTION.md
CONTRIBUTING.md		CONTRIBUTING.md
DEPLOYMENT.md		DEPLOYMENT.md
LICENSE		LICENSE
MCP_DOCUMENTATION.md		MCP_DOCUMENTATION.md
README.md		README.md
SECURITY.md		SECURITY.md
debug-db.js		debug-db.js
eslint.config.mjs		eslint.config.mjs
next-env.d.ts		next-env.d.ts
next.config.ts		next.config.ts
package-lock.json		package-lock.json
package.json		package.json
postcss.config.mjs		postcss.config.mjs
schema.sql		schema.sql
tsconfig.json		tsconfig.json
vitest.config.ts		vitest.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AI Psychometrics Lab

Overview

Scoring Methodology

1. Item Scoring (1-5)

2. Reverse Coding

3. Facet Scores (4-20)

4. Domain Scores (24-120)

5. Interpretation

6. Dark Triad (Short Dark Triad / Dirty Dozen)

7. Other Assessments

Getting Started

Deployment

Environment variable policy (recommended)

License

Attribution

Contributing

Security

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

AI Psychometrics Lab

Overview

Scoring Methodology

1. Item Scoring (1-5)

2. Reverse Coding

3. Facet Scores (4-20)

4. Domain Scores (24-120)

5. Interpretation

6. Dark Triad (Short Dark Triad / Dirty Dozen)

7. Other Assessments

Getting Started

Deployment

Environment variable policy (recommended)

License

Attribution

Contributing

Security

About

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages