Your data has answers. CellCog asks the right questions. #1 on DeepResearch Bench (Feb 2026) + frontier coding agent — upload messy CSVs with minimal prompting and get structured insights back: charts, dashboards, statistical reports, and clean data. Full Python access for data cleaning, exploratory analysis, visualization, hypothesis testing, ML model evaluation, and dataset profiling. Analyzes everything, presents it beautifully.
OpenClaw skills run inside an OpenClaw container. EasyClawd deploys and manages yours — no server setup needed.
- Added clear author and dependency metadata to SKILL.md. - Changed prerequisite wording to reference the `cellcog` skill directly. - No functional changes; documentation improvements only.
---
name: data-cog
description: "Your data has answers. CellCog asks the right questions. #1 on DeepResearch Bench (Feb 2026) + frontier coding agent — upload messy CSVs with minimal prompting and get structured insights back: charts, dashboards, statistical reports, and clean data. Full Python access for data cleaning, exploratory analysis, visualization, hypothesis testing, ML model evaluation, and dataset profiling. Analyzes everything, presents it beautifully."
metadata:
openclaw:
emoji: "🔢"
author: CellCog
dependencies: [cellcog]
---
# Data Cog - Your Data Has Answers, CellCog Finds Them
**Your data has answers. CellCog asks the right questions.** #1 on DeepResearch Bench (Feb 2026) + frontier coding agent.
Most AI tools return code when you ask about data. CellCog returns answers — actual charts, clean datasets, statistical reports, and visual dashboards. Upload messy CSVs with a minimal prompt, and CellCog's coding agent explores your data, finds the patterns, and presents them beautifully. Full Python access for everything from data cleaning to ML model evaluation.
---
## Prerequisites
This skill requires the `cellcog` skill for SDK setup and API calls.
```bash
clawhub install cellcog
```
**Read the cellcog skill first** for SDK setup. This skill shows you what's possible.
**Quick pattern (v1.0+):**
```python
# Fire-and-forget - returns immediately
result = client.create_chat(
prompt="Analyze this dataset: <SHOW_FILE>/path/to/data.csv</SHOW_FILE>",
notify_session_key="agent:main:main",
task_label="data-analysis",
chat_mode="agent" # Agent mode for most data work
)
# Daemon notifies you when complete - do NOT poll
```
---
## What Makes Data-Cog Different
### Code as Tool, Not as Output
Other AI tools give you Python code and say "run this." CellCog **runs the code for you** and delivers the results:
| Other AI Tools | Data-Cog |
|---------------|----------|
| "Here's a pandas script to analyze your data" | Here are your actual insights with charts |
| "Run this matplotlib code to see the chart" | Here's the chart, annotated with findings |
| "This SQL query will find outliers" | Found 23 outliers, here's what they mean |
| "You'll need scikit-learn for this" | Model trained, here's accuracy and feature importance |
You upload data. You get answers. The code runs behind the scenes.
---
## What Data Work You Can Do
### Exploratory Data Analysis
Understand your data fast:
- **Dataset Profiling**: "Analyze this CSV — distributions, missing values, outliers, correlations, and data quality summary"
- **Pattern Discovery**: "What patterns and trends exist in this sales data? Surprise me."
- **Anomaly Detection**: "Find unusual patterns in this server log data — what looks abnormal?"
- **Relationship Analysis**: "What factors most strongly correlate with customer churn in this dataset?"
**Example prompt:**
> "Analyze this dataset:
> <SHOW_FILE>/path/to/customer_data.csv</SHOW_FILE>
>
> I don't know much about this data yet. Give me:
> - Overview: rows, columns, data types, missing values
> - Key distributions and summary statistics
> - Most interesting correlations
> - Any outliers or data quality issues
> - 3-5 insights that jump out
>
> Present findings as an interactive HTML report with charts."
### Data Cleaning & Transformation
Wrangle messy data into shape:
- **Clean Messy Data**: "Clean this CSV — fix inconsistent date formats, handle missing values, remove duplicates, standardize column names"
- **Data Transformation**: "Pivot this transaction data into a monthly summary by product category"
- **Data Merging**: "Join these three CSV files on customer_id and create a unified dataset"
- **Feature Engineering**: "Create useful features from this raw data for predicting house prices"
**Example prompt:**
> "Clean and transform this dataset:
> <SHOW_FILE>/path/to/messy_data.csv</SHOW_FILE>
>
> Issues I know about:
> - Dates are in mixed formats (MM/DD/YYYY and YYYY-MM-DD)
> - 'Revenue' column has some values with $ signs and commas
> - Duplicate rows exist
> - Missing values in 'Region' column
>
> Clean it up and give me back a clean CSV plus a summary of what you changed."
### Statistical Analysis
Rigorous analysis with real numbers:
- **Hypothesis Testing**: "Is there a statistically significant difference in conversion rates between our A and B variants?"
- **Regression Analysis**: "What factors predict employee salary in this HR dataset? Build a regression model."
- **Time Series Analysis**: "Analyze this monthly revenue data — trend, seasonality, and forecast next 6 months"
- **Cohort Analysis**: "Create a cohort analysis showing user retention by signup month"
**Example prompt:**
> "I ran an A/B test on our checkout page:
> <SHOW_FILE>/path/to/ab_test_results.csv</SHOW_FILE>
>
> Columns: user_id, variant (A or B), converted (0/1), revenue, timestamp
>
> Tell me:
> - Is variant B statistically better? (p-value, confidence interval)
> - Conversion rate difference
> - Revenue per user difference
> - Sample size adequacy check
> - My recommendation: ship B or keep testing?
>
> Present with clear charts and a plain-English conclusion."
### Visualization & Reporting
Turn data into visual stories:
- **Chart Generation**: "Create a set of charts showing our quarterly performance from this data"
- **Dashboard Reports**: "Build an interactive dashboard from this sales dataset with filters by region and product"
- **Presentation-Ready Visuals**: "Create publication-quality charts from this research data"
- **Comparison Visuals**: "Visualize how our metrics compare to industry benchmarks"
### Machine Learning
Applied ML without the setup:
- **Classification**: "Predict which customers will churn based on this dataset — train a model, show feature importance"
- **Clustering**: "Segment these customers into groups based on behavior — how many natural clusters exist?"
- **Forecasting**: "Forecast next quarter's sales using this historical data"
- **Model Evaluation**: "I trained a model — here are the predictions. Evaluate: accuracy, precision, recall, confusion matrix, ROC curve"
**Example prompt:**
> "Predict customer churn from this dataset:
> <SHOW_FILE>/path/to/customer_features.csv</SHOW_FILE>
>
> Target column: 'churned'
>
> - Train a model, try at least 2 algorithms
> - Show feature importance — what drives churn?
> - Confusion matrix and ROC curve
> - Plain-English summary: 'The top 3 reasons customers churn are...'
> - Actionable recommendations based on findings
>
> I want insights, not just metrics."
---
## Supported Data Formats
| Format | How to Send |
|--------|-------------|
| **CSV** | Upload via SHOW_FILE |
| **Excel (XLSX)** | Upload via SHOW_FILE |
| **JSON** | Upload via SHOW_FILE |
| **Parquet** | Upload via SHOW_FILE |
| **SQL exports** | Upload the dump via SHOW_FILE |
| **Inline data** | Describe small datasets directly in prompt |
---
## Output Formats
| Format | Best For |
|--------|----------|
| **Interactive HTML Dashboard** | Explorable charts, filters, drill-downs |
| **PDF Report** | Shareable analysis reports with charts and findings |
| **Clean CSV/XLSX** | Cleaned or transformed data files for downstream use |
| **Markdown** | Quick insights for integration into docs |
---
## Chat Mode for Data
| Scenario | Recommended Mode |
|----------|------------------|
| Quick data cleaning, simple charts, basic statistics | `"agent"` |
| Deep analysis with multiple techniques, ML modeling, comprehensive reports | `"agent team"` |
**Use `"agent"` for most data work.** Data cleaning, EDA, chart generation, and standard statistical analysis execute well in agent mode.
**Use `"agent team"` for complex analytical projects** — multi-technique analysis, ML model comparisons, or when you need deep domain reasoning about what the data means.
---
## Example Prompts
**Minimal prompt, maximum insight:**
> "Analyze this:
> <SHOW_FILE>/path/to/data.csv</SHOW_FILE>
Read full documentation on ClawHub