
Add OSA chat widget to documentation#13702

Open
neuromechanist wants to merge 6 commits into mne-tools:main from neuromechanist:add-osa-chat-widget

Conversation

@neuromechanist
Contributor

Summary

  • Adds the Open Science Assistant (OSA) chat widget to the MNE-Python documentation
  • The widget provides an AI assistant specialized in MNE-Python that can help users with MEG, EEG, and neurophysiological data analysis questions
  • Loads the widget script from the OSA CDN and configures it with 3 suggested questions

Details

The widget is a lightweight floating chat button that appears on all documentation pages. It:

  • Uses the mne community configuration from OSA
  • Shows 3 suggested starter questions relevant to MNE-Python users
  • Supports page context awareness (sends current page URL to help provide contextually relevant answers)
  • Is non-intrusive; users can dismiss or ignore it

The widget is served from demo.osc.earth and configured via the communityId: 'mne' setting, which auto-configures the API endpoint, title, theme color, and initial greeting.
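For reviewers building the docs locally, one common way to load such a third-party script on every Sphinx-generated page is via `html_js_files` in `conf.py`. This is a sketch only; the PR's actual mechanism may differ, and the local filename below is hypothetical:

```python
# conf.py (sketch, not necessarily this PR's approach): Sphinx emits each
# entry in html_js_files as a <script> tag on every generated page.
html_js_files = [
    # (URL, attributes) tuples are supported since Sphinx 1.8
    ("https://demo.osc.earth/osa-chat-widget.js", {"defer": "defer"}),
    # a small local file (placed in html_static_path) could then call
    # OSAChatWidget.setConfig({communityId: 'mne'}); filename is hypothetical
    "osa-widget-config.js",
]
```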

Test plan

  • CORS origins for mne.tools and *.mne.tools are enabled on the OSA backend
  • Verify widget loads on local Sphinx build (make html)
  • Verify chat responses work end-to-end
  • Check widget appearance on mobile viewports

@welcome

welcome bot commented Feb 27, 2026

Hello! 👋 Thanks for opening your first pull request here! ❤️ We will try to get back to you soon. 🚴

@neuromechanist
Contributor Author

Hi maintainers! Could someone approve the CircleCI pipeline? It requires maintainer approval for first-time contributors from forks.

Also, if you'd like to customize the widget further (e.g., adding your own logo), you can pass additional options to OSAChatWidget.setConfig(). For example:

OSAChatWidget.setConfig({
  communityId: 'mne',
  logo: 'https://mne.tools/stable/_static/mne_logo.svg',
  suggestedQuestions: [...]
});

Full configuration reference: https://docs.osc.earth/osa/deployment/widget/#full-configuration

@tsbinns
Contributor

tsbinns commented Feb 28, 2026

@neuromechanist Thanks for opening this PR. Pushing AI tools is not something that everyone has been fully comfortable with. See for instance this discussion regarding an MCP: #13288 (comment).

If I understand correctly, OSA is more of a chatbot, but those concerns over the correctness of the bot's answers would still apply. One of the suggested prompts, 'How do I filter and preprocess my data?', is such an enormous and nuanced question. Others may feel differently, but that's my concern.

And yes, while nobody has to use it, its inclusion in the documentation is still implicit support that its answers can be trusted.

@larsoner
Member

larsoner commented Mar 2, 2026

I tried it and it failed, but the chat icon in the lower right was at least discoverable enough.

Maybe this is an issue with it being on CircleCI, not sure...

Would be cool to be able to test here by clicking the CircleCI link, as then we could look into how reasonable its responses are.

And yes, while nobody has to use it, its inclusion in the documentation is still implicit support that its answers can be trusted.

I wonder if, as part of this, we could adjust the widget to have a nice, clear warning that it's AI-generated and answers may or may not be correct. Maybe with a link to read more. From talking to people in the community, people are already "going to ChatGPT for help," so if we can get OSA to be at least as good as or better than that, it might be a step in a helpful direction for our users. In other words, try to express: "Hey, it's not clear this is a good idea, but if you're going to use AI tools, we tried to make this one accurate." I suspect that because @neuromechanist here has carefully chosen what to ingest to teach it (e.g., our docs, published papers), it probably does some reasonable things, hopefully more so than ChatGPT etc. As maintainers we probably do need to discuss more over Zoom or similar at some point, though.

@neuromechanist is this used by other large scientific projects so far that you know of? For example if NumPy or somebody used it, would help me at least trust it a bit more.

Member

@drammock drammock left a comment


TODO:

  • looks like an entry is still missing in doc/changes/names.inc
  • changelog entry should use :newcontrib: role
  • before I'm comfortable making this live, we should figure out how to add a prominent caveat (something like "this is AI, no guarantee it's accurate, please double-check against our docs")
  • Before this goes live, I think we need to decide whether we're committing to this long-term (and thus committing to Yahya's suggested monthly cost-sharing) or just trying it out. If we're just trying it out, that should be prominently stated in the caveat mentioned above (so users don't get as mad if we turn it off later).
  • if we can't get it working on circleCI, someone will need to do a local doc build and get it working there in order to test it out I guess

@tsbinns FWIW, I'm comfortable making it easier for end users to get good mne-related results from AI for their own scripts. To me that's a separate problem from whether we allow AI-aided contributions to the codebase.

@larsoner
Member

larsoner commented Mar 2, 2026

changelog entry should use :newcontrib: role

@drammock I think there was a contrib back in 1.7, see #13702 (comment)

@neuromechanist
Contributor Author

This was suggested in an email to @drammock, @larsoner, and @agramfort with some more background.

It's probably useful for everyone to have the email here as well:

Hi Dan,

You might have heard that I've been building the Open Science Assistant (OSA) platform and have already onboarded EEGLAB, NEMAR, HED, BIDS, and FieldTrip. I went ahead and created one for MNE-Python as well. You can try it here:

Live demo: https://demo.osc.earth/mne

What it knows

The assistant has a continuously synced knowledge base with:

  • Documentation: 30+ tutorials and guides from mne.tools, auto-fetched and converted to markdown
  • Codebase: 7,200+ function/class docstrings from MNE-Python, MNE-BIDS, MNE-Connectivity, MNE-ICALabel, and MNE-LSL
  • GitHub: Issues and PRs from all 5 repos, synced daily
  • Forum: Topics and answers from mne.discourse.group (~6,000 topics, sync in progress)
  • Papers: 1,100+ academic papers related to MNE, including citing papers for the core publications

It understands the MNE data pipeline (Raw -> Epochs -> Evoked -> SourceEstimate), cites sources with links, and provides concise answers.

Let me know if you want to onboard it and add it to your docs. There are some requirements, outlined below.

Embedding on mne.tools

If you want to add it to your website:

<script src="https://demo.osc.earth/osa-chat-widget.js"></script>
<script>
  OSAChatWidget.setConfig({
    communityId: 'mne'
  });
</script>

Floating chat button, bottom-right corner. Auto-detects the page URL for context-aware answers. Lightweight (~30KB), no dependencies, supports dark mode, works on mobile.

You can also give it page-specific context:

<script>
  OSAChatWidget.setConfig({
    communityId: 'mne',
    widgetInstructions: 'The user is reading the ICA tutorial.'
  });
</script>

Customizing the assistant

The whole configuration is one YAML file:

src/assistants/mne/config.yaml

You manage it via PRs to our repo. You can change the system prompt, which docs are preloaded, widget appearance, suggested questions, CORS origins for mne.tools, sync schedule, etc. The schema reference documents all options.
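As an illustration of the single-file idea, such a config might look roughly like this. The field names here are hypothetical (except cors_origins, which appears below); consult the schema reference for the real options:

```yaml
# src/assistants/mne/config.yaml -- illustrative sketch only
community_id: mne            # hypothetical field name
system_prompt: |             # hypothetical field name
  You are an assistant specialized in MNE-Python...
suggested_questions:         # hypothetical field name
  - How do I filter and preprocess my data?
cors_origins:
  - https://mne.tools
  - https://*.mne.tools
```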

To enable the widget on mne.tools, uncomment the CORS section in the config:

cors_origins:
  - https://mne.tools
  - https://*.mne.tools

What you'd need

  • API key: You provide an OpenRouter API key. This gives you control over costs and model selection. All MNE assistant requests route through your key.
  • Configuration: You manage the config via PRs. Change whatever you need.
  • We handle: Server infrastructure, knowledge sync, and platform maintenance.

The demo currently uses a shared platform key, so feel free to try it out. For production use on mne.tools, you'd set up your own OpenRouter key.

Why it matters even if you don't embed it

Even if you decide not to add the widget to mne.tools, having these MNE resources synced and indexed is valuable for the broader ecosystem. I'm planning to build an arena where a question to one assistant (say, the BIDS assistant about data preparation) can also query the MNE-Python, EEGLAB, FieldTrip, and other onboarded assistants, so the response is much richer for every community. The more knowledge sources we have connected, the better the answers get across the board.

Let me know what you think.

Would be cool to be able to test here by clicking the CircleCI link, as then we could look into how reasonable its responses are.

Yes, CircleCI cannot access the backend, at least as of now. This is a protective precaution and can be controlled with the CORS flag (please see the email and the documentation).
Still, you can test the responses at https://demo.osc.earth/mne.
If you want to allow CircleCI, you can add it under the CORS settings for MNE, but I would recommend making the entry very specific to your CircleCI domain, to prevent bots and unrelated users from using up tokens.

I wonder if, as part of this, we could adjust the widget to have a nice, clear warning that it's AI-generated and answers may or may not be correct. Maybe with a link to read more.

Thanks, a very good and necessary suggestion; see OpenScience-Collective/osa#245. It will be added today.

I suspect because @neuromechanist here has carefully chosen what to ingest to teach it (e.g., our docs, published papers) it probably does some reasonable things

Thanks. I have barely experimented with the prompt and knowledge sources for MNE yet. Please note that the responses are only as good as the information and the prompts (i.e., how that information is used). I would appreciate it if the community got involved in tuning the responses. Note that creating ad-hoc backends is planned (OpenScience-Collective/osa#219). For now, if you want to see the effect of your changes to the YAML file, you need to merge into develop and wait about 10 to 15 minutes for it to deploy to the backend. I can create a team for MNE so you can merge into dev without approval for the MNE Assistant directory.

Having said that, one advantage of OSA is that it can parse information from the many resources a project has (PRs, issues, docstrings, and even the Discourse server); see the details at https://status.osc.earth/osa/mne.

@neuromechanist is this used by other large scientific projects so far that you know of? For example if NumPy or somebody used it, would help me at least trust it a bit more.

Not yet, but I will probably present OSA at an Open Science conference next month (if I can make it), organized by a couple of these large communities. A couple of other neuroscience communities have reached out, and I am working with them on onboarding.

If I undestand, OSA is more of a chatbot, but still those concerns over the correctness of the bot's answers would apply.

Yes, it is a chatbot with the goal of providing as much transparency as possible about how it is designed and what information it uses, while being easy to implement for already-stretched open-source maintainers (a basic setup only requires one YAML file). Any use of AI comes with its own concerns, yet it seems that we inevitably use AI or AI products on a daily basis. Whether we choose to be active participants or just users of these products is up to us, for sure.

One of the suggested prompts 'How do I filter and preprocess my data?' is such an enormous and nuanced question.

Like almost all parts of OSA, the suggested questions can be adjusted. This is probably even easier than most options, because you can change the questions from the widget script and even make them specific to the page being served (similarly, you can amend the prompt based on the page as well).
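For example, a page-specific variant of the embedding snippet might look like this. This is a sketch only: communityId, widgetInstructions, and suggestedQuestions are the option names used elsewhere in this thread, while the path check and the question text are made up:

```html
<script>
  // Sketch: tailor the suggested questions (and extra prompt context)
  // to the documentation page being served.
  var onICAPage = window.location.pathname.indexOf('ica') !== -1;  // made-up check
  OSAChatWidget.setConfig({
    communityId: 'mne',
    widgetInstructions: onICAPage ? 'The user is reading the ICA tutorial.' : '',
    suggestedQuestions: onICAPage
      ? ['How many ICA components should I fit?']       // example question only
      : ['How do I filter and preprocess my data?']
  });
</script>
```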

Co-authored-by: Daniel McCloy <dan@mccloy.info>
@neuromechanist
Contributor Author

neuromechanist commented Mar 2, 2026

The widget now has a disclaimer baked in for all communities once this is merged into prod (I still need to add an option to remove/adjust it more easily). Thanks @larsoner!

You can also customize the welcome message and add more caution if needed, either by updating the YAML file or, more easily, by adding an initialMessage field to the widget snippet (see docs):

https://feature-issue-245-widget-foo-demo.osc.earth/mne

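For reference, setting such a message from the embedding side might look like the following. This is a sketch: initialMessage is the field named above, and the wording is only an example, not the actual disclaimer text:

```html
<script>
  OSAChatWidget.setConfig({
    communityId: 'mne',
    // example wording only -- the real caveat text is up to the maintainers
    initialMessage: 'Hi! I am an AI assistant for MNE-Python. Answers may be ' +
                    'incorrect; please double-check against the official docs.'
  });
</script>
```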

@scott-huberty
Contributor

I would appreciate it if the community got involved in tuning the responses. Note that creating ad-hoc backends is planned (OpenScience-Collective/osa#219). For now, if you want to see the effect of your changes to the YAML file, you need to merge into develop and wait about 10 to 15 minutes for it to deploy to the backend. I can create a team for MNE so you can merge into dev without approval for the MNE Assistant directory.

I imagine that if MNE were to adopt this bot, folks here would be pretty motivated to make sure that the responses are correct, and to make fixes upstream as needed! I'd be interested to learn how difficult it is to get a development environment set up for debugging issues and implementing fixes.

For example, I just asked the MNE-Python assistant:

"How can I convert the units of eyetracking data from pixels-on-screen to radians of visual angle using MNE-Python?",

and the bot told me "MNE-Python doesn't have built-in functions specifically for converting eyetracking pixels to visual angle"

But we do have a function exactly for this, that is documented and used in 3 tutorials: mne.preprocessing.eyetracking.convert_units

If I have some time in the coming weeks, maybe I can explore how feasible it is to fix/fine-tune this response.

@cbrnr
Contributor

cbrnr commented Mar 5, 2026

Related to this discussion, we might also want to consider Kapa AI, which offers a similar service that can be free of charge for OSS (https://docs.kapa.ai/kapa-for-open-source). I've used their chatbot on https://docs.pola.rs/api/python/stable/reference/index.html and https://dplyr.tidyverse.org and have so far always received very good answers. They also claim that if the bot doesn't have any sources from the docs to back up an answer, it will just say "I don't know" instead of hallucinating something.

@cbrnr
Contributor

cbrnr commented Mar 5, 2026

Just to add one more example, I tried the OSA chatbot with the following question:

How do I parse the status channel of a BDF file correctly?

And the answer was completely wrong:

BDF Status Channel Handling
The key point: MNE-Python's read_raw_bdf() automatically handles the BDF status channel for you. You don't need to manually parse it in most cases.

What happens automatically:

The BDF status channel (the last channel in BDF files) is read and processed
Trigger/event information encoded in the status channel is extracted and converted to MNE `Annotations`
These annotations are accessible via `raw.annotations`
Basic usage:

import mne

raw = mne.io.read_raw_bdf('data.bdf', preload=True)

# Access extracted events/triggers from status channel
print(raw.annotations)

# Find events for analysis
events = mne.find_events(raw)
If you need to work with the raw status data:

You can access the status channel directly as a regular channel:

# Get the status channel data
status_data = raw.get_data(picks='STI 014')  # or whatever the status channel is named

# Decode status bits if needed
# BDF status channel encodes triggers in specific bit positions
For more control:

Check the read_raw_bdf() parameters for options like stim_channel to specify how triggers are handled.

The correct answer is that the status channel is not parsed automatically and mne.find_events() must be used, something that is very clear from the API docs (https://mne.tools/stable/generated/mne.io.read_raw_bdf.html).
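For readers following along, the decoding that read_raw_bdf() does not do for you boils down to bit masking. A standalone sketch with made-up values (on BioSemi systems, trigger codes live in bits 0-15 of the Status channel and system flags in bits 16-23):

```python
# Made-up 24-bit Status samples: bit 20 (a system flag) is set on every
# sample, on top of trigger codes 4, 8, and 15 in the low 16 bits.
status_samples = [0x100004, 0x100008, 0x10000F]

# Masking keeps only the trigger bits, as mne.find_events(raw, mask=2**16 - 1)
# would do on real data.
trigger_codes = [s & (2**16 - 1) for s in status_samples]
print(trigger_codes)  # [4, 8, 15]
```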

@hoechenberger
Member

hoechenberger commented Mar 5, 2026

+1 on exploring Kapa.

Trying to test use cases and "tune" the AI to provide correct responses is not something we should have to do. There are systems out there that provide useful responses automatically, and we should evaluate those instead.

@neuromechanist I appreciate your effort, this is certainly pushing in the right direction and sparking important discussions! I just believe that this concrete solution here is not quite it (yet!)

@neuromechanist
Contributor Author

I greatly appreciate all the feedback here.

Regardless of whether MNE adopts the widget, we will maintain the MNE knowledge base on OSA. It serves users across other onboarded communities (BIDS, NEMAR, EEGLAB, FieldTrip, HED) (see OpenScience-Collective/osa#167) who may have MNE-related questions. Having an MNE resource would benefit the larger neuroscience community.

On Kapa AI

@cbrnr @hoechenberger evaluating Kapa alongside OSA makes sense. Key differences: OSA is fully open source (MIT), the config is community-owned (a single YAML file), the knowledge sources include GitHub issues/PRs, Discourse, and academic papers (not just docs), and a cross-community layer is planned where asking BIDS a question can also query MNE, EEGLAB, and FieldTrip. Kapa also scrapes docs, GitHub, and Discourse, but I could not find whether they have infrastructure for traversing citing papers or for adding specific tools like a BIDS validator. I also did not find any mention of how long they commit to providing the service for free. That said, evaluate both and pick what works best for MNE.

On commitment and cost

@drammock I'd suggest framing this as a trial. The widget can be removed with a single commit, no lock-in 🤓. SCCN is happy to cover costs as part of our mission to advance neuroscience tools and our grants outreach. No financial commitment required from MNE. If you later want direct cost control, BYOK (Bring Your Own Key) is available.

On the BDF status channel issue (@cbrnr)

Knowing that the status channel is different from TAL and trigger handling is a great insight (which the docs also assume the user has; TAL is not even defined in the doc, just used). I also thought parsing the status channel is how events from BioSemi are handled (despite working with this system for the last decade). Only on reading through the last paragraph of the docs did I see that TAL is parsed automatically, but the status channel is NOT.

What the LLM actually did: 1) identified the premise, 2) searched the function docs, 3) found read_raw_bdf() documentation. Where it failed: it conflated general BDF event handling (TAL, which IS automatic) with the specific status channel (which is NOT). The word "status" is overloaded. We are not asking a BioSemi expert; we are asking a tool to look at the docs.

To test this, ask the same question with status in backticks:

How do I parse the status channel of a BDF file correctly?

Based on the documentation, here's what you need to know about the status channel in BDF files:

Status channel handling: The status channel in BDF files (typically the last channel) contains trigger/event information encoded as bit flags. MNE reads this automatically, but you need to extract events properly:

import mne
raw = mne.io.read_raw_bdf('your_file.bdf', preload=True)
events = mne.find_events(raw, stim_channel='Status')
  1. Channel naming: The status channel is typically named 'Status' in BioSemi BDF files
  2. Bit decoding: The status channel uses bit flags to encode multiple triggers simultaneously
  3. Re-referencing: BioSemi doesn't perform automatic common-mode rejection, so set an appropriate reference after loading

Should the assistant have assumed this was a question about the specific status channel? Probably yes. Does this make it completely wrong? I don't think so; it found the right docs, and the reasoning was traceable. Can it be improved? Absolutely. The question is whether you want to have a say in when and how.

On the eyetracking issue (@scott-huberty)

Trickier. Even rephrased, the assistant says MNE doesn't support it. But asking about the specific function works, confirming the knowledge IS in the database; retrieval just didn't reach it.

Do you know what this does: mne.preprocessing.eyetracking.convert_units

convert_units converts eye-gaze data between pixels and radians of visual angle (or vice versa).

  • Purpose: Transform eye-tracking measurements from one unit system to another
  • Supported conversions: pixels <-> radians of visual angle
How can I convert the units of eyetracking data from pixels-on-screen to radians of visual angle?
from mne.preprocessing.eyetracking import convert_units

raw_converted = convert_units(raw,
                              screen_distance=0.6,
                              screen_resolution=(1920, 1080),
                              screen_size=(0.53, 0.30))

Root cause: the system prompt lists MEG, EEG, sEEG, ECoG, and NIRS but omits eyetracking. The assistant assumed MNE doesn't support it instead of searching first.

Fixes and next steps

Both issues are diagnosable and fixable:

  1. Query expansion for ambiguous questions (#249)
  2. Prompt improvements: supported data types, uncertainty handling, eyetracking docs (#250)
  3. Happy to create an MNE team on the OSA repo so maintainers can iterate on the config directly

@scott-huberty on contributing: edit src/assistants/mne/config.yaml, merge to develop, and the dev server auto-deploys in ~10 minutes. Full local dev is still rough (requires synced databases); ephemeral preview backends are on the roadmap (#219). Happy to help you get started.

Whatever MNE decides, the feedback here is valuable and will improve OSA across all communities. Thanks again.

@drammock
Member

drammock commented Mar 5, 2026

thank you for the very detailed response @neuromechanist. I took a few minutes to look at Kapa's and OSA's websites. Here are my "hot takes":

  1. I don't think we know yet whether Kapa will be any better. If the problem is that our docs are somehow misleading to a RAG-based system (is OSA also RAG-based? wasn't sure after reading https://docs.osc.earth/osa/), presumably both systems will choke on the same set of questions.
  2. either of these tools (OSA or Kapa) provides an opportunity for us to detect where our docs are misleading or unclear and make targeted improvements to the docs. RAGs ≠ humans, but it seems safe to say that improvements that help RAGs get the right answer will probably help human readers too.
  3. It would be great if some of our maintainers and/or community members were willing to generate a bunch of questions and find out where the tool(s) make mistakes, and then see what changes to the docs are needed to get them to succeed.
  4. It would be great if there were a way to easily flag question-answer pairs as incorrect (and get a data dump of them), to facilitate the kind of failed-question-driven doc improvements mentioned above.
  5. One thing Kapa offers out-of-the-box is a discord integration. IDK if our users would want that (or even prefer it to a web-based widget), but it would be a way to test it out in a slightly less public way. One could even test OSA on the website (or command line) and Kapa in Discord, to compare results more easily.
  6. Kapa doesn't have a Discourse integration, but they do expose an API for building custom integrations. I can imagine one day having a Discourse integration that would auto-answer Forum questions (provided we made it clear that there was a way to "escalate this question to a human" / disable further AI responses on the same thread once it's demonstrated that it can't give a correct answer to the given question).
  7. I like the fact that OSA is community-developed, and that we would have the ability to improve its performance for our community by tweaking the tool itself (not just by tweaking our docs so it doesn't get misled).
  8. Ultimately I think we'll want to extend the scope to include MNE-BIDS, MNE-LSL, MNE-Connectivity... I'm not sure whether this points in favor of a particular tool or not.

For me, the most pressing question is: who is willing to do the work to test out one or both of these tools, and eventually declare themselves satisfied enough to recommend that we adopt it? I'm not in favor of unleashing an unvetted tool onto our users and hoping for the best.

@cbrnr
Contributor

cbrnr commented Mar 6, 2026

Thanks @neuromechanist, I really appreciate your answer and your effort! For reference, I asked Claude Sonnet 4.6 (not even their best model, but it had the context of the MNE-Python source repository) the same question, and it got it perfectly right. If a model is so brittle that it cannot understand the difference between status (and its connection to TAL and annotations) and `status`, then it is not a good model. Tweaking every single reply that does not produce a good answer is not a viable solution; that would be an immense amount of work that at least I am not willing to invest when there are other solutions out there that just work.

For the sake of completeness, here's Claude's answer:

Details Here's how to correctly parse the **Status channel** of a BDF file (typically produced by Biosemi ActiveTwo systems) in MNE-Python:

1. Reading the file

The 'Status' channel (named 'status' or 'trigger', case-insensitive) is automatically detected as a STIM channel when you use stim_channel='auto' (the default):

import mne
raw = mne.io.read_raw_bdf('my_file.bdf', preload=True)

See edf.py — the _check_stim_channel function (around edf.py) uses ["status", "trigger"] as default STIM channel names.


2. The key bit-layout problem

On Biosemi devices, the 24-bit Status channel has this layout:

Bits Content
0–15 Trigger codes from your stimulation device
16–23 System codes (e.g. CMS in/out-of-range, battery low, speed mode)

If you call mne.find_events() without masking, the system-code bits will corrupt your event values. The edf.py shows these solutions:


3. Masking trigger bits (bits 0–15)

# Approach 1: mask post-hoc
events = mne.find_events(raw)
events[:, 2] &= (2**16 - 1)  # keep only bits 0–15

# Approach 2: use the mask parameter directly
events = mne.find_events(raw, mask=2**16 - 1)

4. Checking system codes (bits 16–23)

events = mne.find_events(raw)

# Example: check the CMS-in-range bit (bit 20)
cms_bit = 20
cms_high = (events[:, 2] & (1 << cms_bit)) != 0

5. Event shift (device-dependent)

Some triggering devices encode events offset by 8 bits; if you get unexpected event codes, try:

events[:, 2] >>= 8

6. BDF Annotations (TAL)

If your BDF file contains a 'BDF Annotations' channel (Text Annotation Layer), MNE parses it automatically into raw.annotations. Convert to events with:

events, event_id = mne.events_from_annotations(raw)

Summary: The most common mistake is forgetting to mask out bits 16–23 (system codes). Always use mask=2**16 - 1 in mne.find_events() unless you specifically need the system bits.

I agree with most of Dan's takes. I think the next step for us should be to test Kapa, as we don't know how their model performs (although I would be surprised if it wasn't substantially better). I could take care of the application, and then everybody can test their chatbot and provide feedback (as I assume they will provide a test implementation of their product). I would clearly state that we want to evaluate whether their product is a worthwhile addition for us, so I would not commit to integrating their chatbot without a thorough testing phase.

@neuromechanist
Contributor Author

neuromechanist commented Mar 6, 2026

is OSA also RAG-based?

Not exactly (at least in the traditional sense); that is why the docs do not claim it. For the common instructions, it loads everything into context; for functions, PRs, Discourse, and paper citations, it queries the database. The assumption is that RAG (where typically documents are chunked and embedded as vectors) may not benefit our usually short documents (<10k words/tokens per doc).

For reference, I asked Claude Sonnet 4.6

Nice reference. If you were to ask OSA the same question with Sonnet 4.6, it would give you a similar answer:

How do I parse the status channel of a BDF file correctly? ## Parsing the BDF Status Channel

In BioSemi BDF files, the Status channel encodes trigger/event information as a 24-bit integer. The key is applying the correct bitmask when extracting events.

The Core Issue: Bitmask

The Status channel contains more than just trigger codes — bits 16–23 encode system flags. You typically only want the lower 16 bits (or lower 8 bits for simple triggers):

import mne

raw = mne.io.read_raw_bdf('your_file.bdf', preload=True)

# Find events, masking to the lower 16 bits
events = mne.find_events(raw, stim_channel='Status', mask=0xFFFF)

# Or if only 8-bit trigger codes are used:
events = mne.find_events(raw, stim_channel='Status', mask=0xFF)

Why This Matters

Without the mask, spurious "events" appear because status bits (like the "new epoch" bit at position 20) toggle independently of your triggers.

Additional Options

  • mask_type='and' (default) — keeps only bits matching the mask
  • shortest_event=1 — catches very short pulses (default is 1 sample)
  • uint_cast=True — useful if you get negative event values due to sign issues with the 24-bit data
events = mne.find_events(raw, stim_channel='Status',
                         mask=0xFFFF, uint_cast=True)

Follow-up questions to consider:

  1. Are you getting unexpected/duplicate events, or no events at all?
  2. Do you know whether your triggers use 8-bit or 16-bit codes?
  3. Do you need help epoching the data after extracting events?

This answer cost 8 cents and used about 15k tokens (with three tool calls, using caching with a 90% discount on the tool calls). In contrast, using Haiku 4.5 (the default model for OSA) cost 2 cents. Part of making fair comparisons is controlling for the test parameters: the model, context amount, incurred cost, and caching all change the calculus.

@drammock, @scott-huberty, a complication for OSA development has been access to the DB (creating new ones is time-consuming and usually requires setting up API keys, an environment, etc.). This will be resolved in the upcoming 0.7.2: with the OSA CLI, you will be able to interact with the DB while running the LLM calls locally (even using local LLMs if you like), test tuned instructions, mirror the DB and get write access to the mirror to tweak it, or even clone the DB for local use.

@cbrnr
Contributor

cbrnr commented Mar 6, 2026

This answer cost 8 cents, used about 15k tokens (with three tool calls, using cache with 90% discount on the tool calls). In contrast, using Haiku 4.5 (which is the default model for OSA) cost 2 cents. Part of making comparisons is to control for the test parameters. The model, context amount, incurred cost and caching, all change the calculus.

This explains a lot! I didn't know the default model, and obviously Haiku is a lot worse than Sonnet. I agree that we need to control for all these parameters in a comparison, but at the end of the day what matters is the quality of the answers, and it seems like a better base model than Haiku 4.5 is necessary. If we can use Sonnet 4.6 (I only found 4.5 in the options, but there shouldn't be much difference), that's a whole different story!
