OSGAI

Workshop on Open-Source Generative AI

OSGAI is an NSF-sponsored workshop that will bring together technical leaders in open-source AI and research with the goal of defining and addressing core challenges of Generative AI. The focus will be on the technical issues that are specific to open AI systems, with the goal of targeting and defining impactful areas for study that are currently not being addressed in academic venues. These include:

How can we improve approaches that facilitate adaptation of AI for a broader range of users and use-cases?
How can we further develop an open ecosystem for open data curation and human feedback on generative AI?
How can we develop evaluations that allow value-driven open-source organizations to ensure ethical, safe, and accurate systems?
How can we create tooling to allow open-source organizations to build AI models in a decentralized manner?

Details

The workshop will bring together a small group of researchers and developers interested in defining the future challenges of open-source generative AI.

The event will be hosted at Cornell Tech in New York City on March 25-26, 2024. Two nights of lodging and travel for participants will be supported by the workshop.

Schedule

Time	Session	Speaker	Topic
9:00	Welcome	Sasha Rush	OSGAI
9:10	Introduction	Jeffrey Stanton, NSF	NSF POSE and Broader Goals
9:30	Session	Hanna Hajishirzi	OLMo: Accelerating the Science of Language Modeling
10:00		Stella Biderman	"Ethical" != "Closed": Building a Better World in the Open
10:30	Coffee Break
11:00		Sara Hooker
11:30	Session	John Cook	Building an ecosystem, Not a Monolith
		Tim Dettmers	Competitive advantages of open vs closed-source.
		Kathleen Kenealy	Gemma
		Eugene Cheah	Building a roadmap, from idea to large-scale model training
		Leshem Choshen	Wiki-models through Natural Feedback
		Ying Sheng	Bridging human and LLM systems: present and future
		Hector Liu	From open source to collaborative LLM research
12:45	Lunch
1:30	Group walk
2:30		Peter Henderson	What are the possibilities and limits for safety in open-source foundation models?
3:00		Daphne Ippolito	Data Curation is not One-Size-Fits-All
3:30	Breakout
4:30	Breakout: Reporting

9:00	Session	Ludwig Schmidt	Open source AI for multimodality: OpenCLIP, LAION, and DataComp
9:30		Hao Zhang	Some reflections after running Chatbot Arena for 1 year
10:00		Tatsunori Hashimoto	Lessons learned from the Alpaca project
10:30	Coffee Break
11:00	Session	Greg Leppert
		Emma Strubell	Economics of Open Source
		Yacine Jernite	The roles of data access and transparency
		Swabha Swayamdipta	When all you have are Logits… Towards (Closed-Source) LLM Accountability via Logit Signatures.
		Danqi Chen	TBD
		Louis Castricato	RLAIF, user autonomy, and controllability
		Irina Rish	Continual Training of Foundation Models
12:30	Lunch
2:00	Session	Tegan Maharaj	What are the possibilities and limits for safety in open-source foundation models
2:30		Graham Neubig	Can we make building with open-source AI as simple as prompting ChatGPT?
3:00		Soumith Chintala	We need to create a sinkhole: fixing the post-training data problem for open-source models
3:30	Wrap-Up Discussion: Next Steps

Organizers

Alexander Rush

Professor

Cornell Tech / @srush_nlp

Colin Raffel

Professor

University of Toronto