Babies Learning Language: April 2026

Wednesday, April 22, 2026

Datapages for reusable (and pretty!) data sharing

[This post is joint with Mika Braginsky, a long-time collaborator on data-sharing and data-viz.]

Data sharing is both a critical scientific need and, increasingly, a mandate from many research funders. The FAIR principles – that data should be findable, accessible, interoperable, and reusable – are an important guide to how data are shared. Yet even FAIR-compliant datasets in approved repositories are often shared in ad-hoc formats that are hard to reuse or to integrate with other data. In contrast, the most impactful datasets tend to be disseminated thoughtfully through dataset-specific or community-specific platforms. These “domain-specific data repositories” (this was our term from a previous blogpost!) create opportunities to develop data standards and ontologies that fit the needs of a particular community, research problem, or instrument type. They also invite engagement through interactive visualizations. But custom repositories and pretty websites with nice visualizations are costly and complicated to create.

We are introducing a set of open-source tools and templates for easily creating datapages, interactive websites to disseminate data for broad reuse. Datapages are easy to deploy for a single project, but extensible enough to host large collections of related datasets. You can learn more and get started at https://datapages.github.io/.

Monday, April 20, 2026

Using AI to improve (not automate away) academic research

Everyone seems to be consumed with AI anxiety. Graduate students are wondering if they will be replaced by assistants, or if they themselves are using AI enough or using it "right". Researchers are wondering what it means to produce research if agents can write whole papers. Everyone is wondering how we will keep up with a literature that is moving ever faster.

Everyone is feeling the pressure to do *more*: do more projects, produce more papers, review more papers. This pressure has already hurt the research ecosystem: for example, conferences struggle to get high-quality, non-automated reviews for the huge volume of submissions they receive.

Instead of doing more, we should think about what we can do that is *different.* We should try to use automation to be more efficient at the annoying parts of our jobs while leaving more time for discovering new knowledge. The key (fast-evolving, unresolved) issue is how AI models will change the frontier of what is scientifically possible. The answer varies from field to field and changes day by day, but my sense is that the rise of semi-autonomous agents will be very interesting for scaling up social and behavioral science.