
ChatGPT Universe

This list is also at https://github.com/cedrickchee/chatgpt-universe

This tiny place of the Web stores a growing collection of interesting things about ChatGPT and GPT-3 from OpenAI.

I want an all-in-one place to keep things about ChatGPT, so I hand-curated this list with the help of others (acknowledged below).

The collections are not limited to only the best resources, tools, examples, demos, hacks, apps, and usages of ChatGPT.


The following resources started off based on awesome-chatgpt lists [1][2] but with my own modifications:

General Resources

ChatGPT Community / Discussion

Examples

Example prompts.

Experiments

Blog Posts and Articles

Prompt Engineering

Wanted: Prompt engineer. Minimum 10 years prompt engineering experience. #hiring #joke

Prompt engineering is dead, long live dialogue engineering. — VP Product, OpenAI

Reason:

Why does ChatGPT work so well? Is it "just scaling up GPT-3" under the hood? In this 🧵, let's discuss the "Instruct" paradigm, its deep technical insights, and a big implication: "prompt engineering" as we know it may likely disappear soon. Source: https://archive.is/dqHI8

  • Learn Prompting - This website is a free, open-source guide on prompt engineering.
  • PromptArray - A prompting language for neural text generators.
  • PromptLayer is a tool for prompt engineers - Maintain a log of your prompts and OpenAI API requests. Track, debug, and replay old completions. Build prompts through trial and exploration.

Examples

Papers

Educational

Videos

Tweets

Books

Development

Unofficial API and SDK.

  • rawandahmad698/PyChatGPT (Python) - Lightweight, TLS-Based API on your CLI without requiring a browser or access token.
  • acheong08/ChatGPT (Python) - Lightweight package for interacting with ChatGPT's API by OpenAI. Uses reverse engineered official API.
  • transitive-bullshit/chatgpt-api (Node.js) - Node.js client for the unofficial ChatGPT API that uses a headless browser.
  • ChatGPT-MS - Multi-Session ChatGPT API. The main code is copied from PyChatGPT.

Tools

  • safer-prompt-evaluator - This shows the results from using a second, filter LLM that analyses prompts before sending them to ChatGPT.
  • Dust - Design and deploy large language model (LLM) apps. Generative models app specification and execution engine. Prompt engineering, re-imagined with one goal: help accelerate LLM deployment.
  • LangChain - Building applications with LLMs through composability.

Training Data

  • LAION LLM - Gathering data for, training, and sharing LAION Large Language Models (LLLM). The group is still writing a tech proposal for the FlanT5-Atlas architecture (or poor man's ChatGPT@Home).
  • open-chatgpt-prompt-collective by Surface Data Collective - A website to generate prompts for training an Open ChatGPT model.
  • BigScience P3 dataset - P3 (Public Pool of Prompts) is a collection of prompted English datasets covering a diverse set of NLP tasks. (PromptSource, a toolkit for creating, sharing and using prompts)
  • Data Augmentation To Create Instructions Form Text - discussion on LAION's Discord. The key to creating a better FlanT5 (ChatGPT@Home).
  • WritingPrompts dataset by FAIR.
  • Templates for FLAN (Finetuned Language Models are Zero-Shot Learners)
  • OpenAI human-feedback dataset on the Hugging Face Hub - The dataset is from the "Learning to Summarize from Human Feedback" paper, where they trained an RLHF reward model for summarization.
  • OpenAI's papers on GPT-2 and GPT-3.x reference these datasets:
    • Common Crawl
      • Number of Tokens: 410 billion
      • Weight in training mix: 60%
    • WebText2
      • An internet dataset created by scraping URLs extracted from Reddit submissions with a minimum score of 3 as a proxy for quality, deduplicated at the document level with MinHash
      • Number of Tokens: 19 billion
      • Weight in training mix: 22%
    • Books1 [3]
      • Number of Tokens: 12 billion
      • Weight in training mix: 8%
    • Books2 [3]
      • Number of Tokens: 55 billion
      • Weight in training mix: 8%
    • Wikipedia
      • Number of Tokens: 3 billion
      • Weight in training mix: 3%
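
To read the token counts and mix weights together, it can help to compare each dataset's share of the raw tokens with its share of the training mix. The snippet below is a minimal sketch of that arithmetic, not code from OpenAI. The names `DATASETS` and `sampling_ratios` are made up for illustration, and the WebText2 weight of 22% follows the dataset table in the GPT-3 paper. A ratio above 1 means a dataset is sampled more often than its size alone would suggest.

```python
# (tokens in billions, weight in training mix in percent)
# Hypothetical structure for illustration; WebText2 uses the 22% weight
# reported in the GPT-3 paper's dataset table.
DATASETS = {
    "Common Crawl": (410, 60),
    "WebText2": (19, 22),
    "Books1": (12, 8),
    "Books2": (55, 8),
    "Wikipedia": (3, 3),
}


def sampling_ratios(datasets):
    """Return mix share divided by size share for each dataset."""
    total_tokens = sum(tokens for tokens, _ in datasets.values())
    # Rounded percentages need not sum to exactly 100, so normalize them.
    total_weight = sum(weight for _, weight in datasets.values())
    ratios = {}
    for name, (tokens, weight) in datasets.items():
        size_share = tokens / total_tokens
        mix_share = weight / total_weight
        ratios[name] = mix_share / size_share
    return ratios


for name, ratio in sampling_ratios(DATASETS).items():
    print(f"{name}: {ratio:.2f}x")
```

With these figures, Common Crawl comes out at roughly 0.72x (undersampled relative to its size) and Wikipedia at roughly 4.9x (oversampled).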

Open Source ChatGPT

We want a ChatGPT alternative like Stable Diffusion.

Goals

  • Open source effort towards OpenAI's ChatGPT.
  • Reverse engineer and replicate ChatGPT models and training data.

Ultimate goal: self-hosted version of ChatGPT.

Lessons

Takeaways from EleutherAI's one-year retrospective (2021):

  • Access to enough compute/hardware/GPU alone won't help you succeed. You need:
    • a proper dataset (beyond the Pile and c4)
    • research expertise
    • engineering capabilities
    • a lot of hard work

Projects

  • FLAN-T5 XXL, aka ChatGPT@Home, is a public model that has undergone instruction finetuning. XXL is an 11B model. It is currently the most comparable model to ChatGPT (InstructGPT models are initialized from the GPT-3.x series (model card)). There have been successful attempts at deploying FLAN-T5 on a GPU with 24 GB of RAM using bitsandbytes-Int8 inference for Hugging Face models. You can run the model easily on a single machine, without performance degradation. This could be a game changer in enabling people outside of big tech companies to use these LLMs. Efforts are already underway to create a better FLAN-T5. The community (i.e., LAION) is working on the FlanT5-Atlas architecture and a collection of prompted/instruction datasets.

  • Open-Assistant - Open-source ChatGPT replication by LAION, Yannic Kilcher et al. This project is meant to give everyone access to a great chat based large language model. (Open Assistant Live Coding with Yannic Kilcher (video)) High-level plans:

    Phase 1: Prompt collection for supervised finetuning (SFT) and to get the prompts for model generated completions/answers.

    Phase 2: Human feedback (e.g. ranking) on multiple outputs generated by the model. For example, five model outputs are shown and the user ranks them from best to worst.

    Phase 3: Optimization with RLHF, which we plan to do via TRLX. We then iterate with this new model over phase 2 and phase 3 again, hopefully multiple times.

    Models will be trained on Summit supercomputer (~6 million NVIDIA V100 hrs per year) [source]

    More info, see the LAION LLM proposal (Google Doc) above.

    Note: Please see the GitHub repo for up-to-date info.

  • CarperAI/TRLX

    News (2023-01-13): They replicated OpenAI's Learning to Summarize paper using trlX library. [report]

  • lucidrains/PaLM-rlhf-pytorch - (WIP) Implementation of RLHF on top of the PaLM architecture. Basically ChatGPT but with PaLM. The developer plans to add retrieval functionality too, à la RETRO. [Tweet]

    News (2022-12-31): There's now an open source alternative to ChatGPT, but good luck running it - My comments: No, there isn't. This is NOT an actual trained model (no weights) you can use. This is just code for training a ChatGPT-like model. Furthermore, the training data (enwik8) is small.

    CarperAI's large-scale RLHF-aligned model (TRLX), trained with LAION's data, is coming out early next year. (Source: Tweet)

  • allenai/RL4LMs - RL for language models (RL4LMs) by Allen AI. It's a modular RL library to fine-tune language models to human preferences.

  • GPT-JT - GPT-JT (6B) is a variant forked off GPT-J (6B), and performs exceptionally well on text classification and other tasks. On classification benchmarks such as RAFT, it comes close to state-of-the-art models that are much larger (e.g., InstructGPT davinci v2)!

  • LEAM (Large European AI Models) - The EU is planning to fund the development of a large-scale ChatGPT-like model. [website, project documents (English, PDF), concept paper (German, PDF)]

  • /r/AiCrowdFund - A place just started (2023) where people can find a way to crowd fund (with GPUs) a large AI. I'm not sure whether they've seen Petals where you can run LLMs at home, BitTorrent‑style (federated learning?). It seems to be headed in that direction.
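
Phase 2 of the Open-Assistant plan above has users rank several model outputs from best to worst. One common way to use such a ranking when training a reward model is to expand it into pairwise preferences, as described in OpenAI's InstructGPT paper. Whether Open-Assistant does exactly this is an assumption here. A minimal sketch, with a hypothetical function name and placeholder answers:

```python
from itertools import combinations


def ranking_to_pairs(ranked_outputs):
    """Expand a best-to-worst ranking into (preferred, rejected) pairs."""
    # combinations() keeps the input order, so the first element of each
    # pair is always the one that was ranked higher.
    return list(combinations(ranked_outputs, 2))


# Placeholder outputs, already ordered from best to worst by a user.
ranked = ["answer A", "answer B", "answer C", "answer D", "answer E"]
pairs = ranking_to_pairs(ranked)
print(len(pairs))  # 10 pairs from 5 ranked outputs
print(pairs[0])    # ('answer A', 'answer B')
```

A single ranking of five outputs therefore yields ten comparisons, which is why ranking is a cheaper way to collect preference data than labeling each pair separately.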

See cedrickchee/awesome-transformer-nlp for more info.

Browser Extensions

Use ChatGPT anywhere.

  • Chrome extension to access ChatGPT as a popup on any page
  • ChatGPT for Google - Chrome/Edge/Firefox extension to display ChatGPT response alongside Google Search results.
  • ChatGPT Everywhere - Chrome extension that adds ChatGPT to every text box on the internet. (demo)
  • Chrome extension - A really simple Chrome Extension (manifest v3) that lets you access OpenAI's ChatGPT from anywhere on the web.
  • summarize.site - Chrome extension to summarize blogs and articles using ChatGPT.
  • WebChatGPT - ChatGPT with Internet access. A browser extension (Chrome and Firefox) that augments your ChatGPT prompts with relevant search results from the Web. (Remember, ChatGPT cannot access the Web and has limited knowledge of the world after 2021.)
  • XP1 - GPT-based Assistant with access to your Tabs.
  • ExtractGPT - A browser extension for scraping data from structured & unstructured pages.

Access ChatGPT From Other Platforms

Bots

Command-Line Interface (CLI) Tools

  • chatgpt-conversation - Voice-based chatGPT.
  • Shell GPT - A CLI productivity tool powered by OpenAI's text-davinci-003 model that helps you accomplish your tasks faster and more efficiently.

Editors and IDEs

Others

  • RayCast Extension (unofficial) - Run ChatGPT through Raycast extension.
  • Google Docs - ChatGPT directly within Google Docs as an Editor Add-on.
  • GPT Index contains a toolkit of index data structures designed to easily connect LLMs with your external data.

Applications

Web applications.

  • ShareGPT - A web app for sharing your wildest ChatGPT conversations with one click. (demo)
  • LearnGPT - Share ChatGPT examples. See the best voted examples. Their goal is to create a resource for anyone who wants to learn more about ChatGPT.
  • ShowGPT - Show your ChatGPT prompts.
  • The search engine for developers, powered by large, proprietary AI language models.
  • GPTDuck – Ask questions about any GitHub repo.
  • LLM Garden - A number of experiments using GPT-3, delivered in a web app.

Desktop applications.

Infrastructure

Newsletters

AI Safety and Ethics

AI alignment and AI interpretability.

AGI and Humanity

  • AI for the Next Era - OpenAI's Sam Altman on the New Frontiers of AI.

    My comments: Reading this after the ChatGPT launch, most of what Sam says in the interview is reminiscent of Ray Kurzweil's predictions about AI and its development.

  • Google won't launch ChatGPT rival because of 'reputational risk'

  • AI Alignment Forum is a single online hub for researchers to discuss all ideas related to ensuring that transformatively powerful AIs are aligned with human values. Discussion ranges from technical models of agency to the strategic landscape, and everything in between.

  • The Expanding Dark Forest and Generative AI by Maggie Appleton - Proving you're a human on a web flooded with generative AI content.

Tweets

ChatGPT is incredibly limited, but good enough at some things to create a misleading impression of greatness.

It's a mistake to be relying on it for anything important right now. It's a preview of progress; we have lots of work to do on robustness and truthfulness.

fun creative inspiration; great! reliance for factual queries; not such a good idea. — Sam Altman, OpenAI

News coverage of that Tweet.

Applications and Tools

  • GPT-2 Output Detector [code] [demo]

    The @HuggingFace GPT detector works very well on ChatGPT-created text. I ran 5 student essays and 5 ChatGPT essays for the same prompt through it, and it was correct every time with >99.9% confidence. — @cfiesler

  • OpenAI's attempts to watermark AI text hit limits - Watermarking may allow for detection of AI text. This post discusses some of the limitations but suggests that it's worth pursuing. Prof. Scott Aaronson "expressed the belief that, if OpenAI can demonstrate that watermarking works and doesn't impact the quality of the generated text, it has the potential to become an industry standard." OpenAI engineer Hendrik Kirchner built a working prototype.
    • Related: Scott Aaronson talks AI Safety on Nov 2022 (video) - GPT outputs will be statistically watermarked with a secret signal that you can use to prove the outputs came from GPT, making it much harder to take a GPT output and pass it off as if it came from a human. How it works: it selects tokens pseudorandomly using a cryptographic PRNG that secretly biases a certain score, which you can also compute if you know the key for this PRNG. Scott doesn't give too many details about how it works, and he admits this can be defeated with enough effort, for example by using one AI to paraphrase another. But if you just insert or delete a few words or rearrange the order of some sentences, the signal will still be there, so it's robust against those sorts of interventions. Many suspect it's possible to bypass it using a clever decoding strategy. Scott is also researching "Planting Undetectable Backdoors in Machine Learning Models" (2022 paper). People are questioning whether they are missing something, or whether all these attempts at recognising LLM outputs are obviously destined to fail. I think they've clearly thought about this but still think this is useful (from the transcript of the lecture: https://scottaaronson.blog/?p=6823).
  • GPTZero demo (Beta) hosted by Streamlit - An app that can quickly and efficiently detect whether an essay is ChatGPT or human written. [Tweet]
  • A Watermark for Large Language Models (paper) by University of Maryland (2023). It operates by pseudorandomly selecting a "whitelist" of tokens before each word is generated (the remaining tokens form the "blacklist") and softly promoting whitelist tokens during sampling. [Tweet (explainer thread by one of the authors), code]
    • They test the watermark using an LLM from the Open Pretrained Transformer (OPT) family, and discuss robustness and security.
  • DetectGPT - Zero-Shot machine-generated text detection using probability curvature. [paper (2023), code, demo, and Twitter thread]
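    • Method: text sampled from a language model tends to sit near a local maximum of that model's log-probability. So, to detect whether text was generated by the model, "perturb" the passage slightly and measure the curvature of the log-probability: perturbing model-generated text should lower its log-probability more than perturbing human-written text does.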
  • New AI classifier for indicating AI-written text by OpenAI. [try the classifier]
    • Results: correctly flags AI-generated text 26% of the time, incorrectly flags human-generated text 9% of the time.
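
The whitelist idea behind the University of Maryland watermark listed above can be illustrated with a toy example. This is a minimal sketch and not the authors' implementation: the "model" is a uniform random sampler over a made-up vocabulary, the whitelist is enforced strictly rather than softly promoted, and the names `whitelist_for`, `generate`, and `z_score` as well as the key string are hypothetical. Detection counts how many tokens fall in the whitelist seeded by the preceding token and compares that count with what unwatermarked text would produce by chance.

```python
import hashlib
import math
import random

VOCAB = [f"w{i}" for i in range(50)]  # made-up vocabulary for illustration


def whitelist_for(prev_token, key, fraction=0.5):
    """Pseudorandomly pick a whitelist, seeded by the key and previous token."""
    digest = hashlib.sha256(f"{key}:{prev_token}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    shuffled = VOCAB[:]
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * fraction)])


def generate(length, key=None, seed=0):
    """Toy 'model': uniform sampling, restricted to the whitelist if key is set."""
    rng = random.Random(seed)
    tokens = [rng.choice(VOCAB)]
    for _ in range(length):
        if key is None:
            candidates = VOCAB
        else:
            # Sort so that sampling does not depend on set iteration order.
            candidates = sorted(whitelist_for(tokens[-1], key))
        tokens.append(rng.choice(candidates))
    return tokens


def z_score(tokens, key, fraction=0.5):
    """How far the whitelist hit count deviates from chance, in std deviations."""
    n = len(tokens) - 1
    hits = sum(
        tokens[i] in whitelist_for(tokens[i - 1], key, fraction)
        for i in range(1, len(tokens))
    )
    return (hits - fraction * n) / math.sqrt(n * fraction * (1 - fraction))


marked = generate(200, key="secret-key")
plain = generate(200)
print(round(z_score(marked, "secret-key"), 2))  # 14.14: every token is a hit
print(round(z_score(plain, "secret-key"), 2))   # close to 0 for unmarked text
```

Anyone holding the key can recompute the whitelists and run the test without access to the model, while text produced without the key lands on the whitelist only about half the time.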

LMOps

General technology for enabling AI capabilities with LLMs and generative AI models.


Demos

Demos [4] and examples in the form of tweets:

Day 1, 2022

  1. Generating detailed prompts for text-to-image models like MidJourney & Stable Diffusion
  2. ChatGPT outperforming Google search
  3. Generating code for automated RPA, e.g. automating the click sequence for house search in Redfin
  4. Generating on-demand code contribution ideas for an about-to-be-fired Twitter employee
  5. Building apps, such as automatic essay summarization
  6. Personal trainer and nutritionist: Generating a weight loss plan, complete with calorie targets, meal plans, a grocery list, and a workout plan
  7. Building a virtual machine inside ChatGPT
  8. Code debugging partner: explains and fixes bugs
  1. Generating programmatic astrophoto processing by detecting constellations in an image
  2. VSCode extension that allows using ChatGPT within the context of a code
  3. Building web AR scenes by using text commands
  4. Stringing cloud services to perform complex tasks
  5. Generating legal contracts
  6. A Chrome extension that presents ChatGPT results next to Google Search
  7. Solving complex coding questions - the end of LeetCode?
  8. Solving complex academic assignments - the end of Chegg?
  9. Answering unanswered Stack Overflow questions - the end of Stack Overflow?
  10. Explaining complex regex without any context
  11. Generating hallucinated chat with a hallucinated person in a hallucinated chat room
  12. Bypassing OpenAI's restrictions by disclosing ChatGPT's belief system
  13. Uncovering ChatGPT's opinion of humans including a detailed destruction plan
  14. An insightful executive summary of ChatGPT
  15. Building e-commerce websites: stitching ChatGPT & Node script to automatically generate SEO-driven blog posts using GPT 3
  16. A ChatGPT extension that generates text, tweets, stories, and more for every website
  17. An extension that adds "Generate PNG" and "Export PDF" functions to ChatGPT's interface
  18. A thread showcasing ways of helping hackers by using ChatGPT
  19. Generating editorial pieces like sports articles
  20. Generating SEO titles to optimize a site's click-through rate
  21. Creating social games. E.g. guess which city is featured in a picture
  22. A tutorial on how to use ChatGPT to create a wrapper R package
  23. ChatGPT can basically just generate AI art prompts. I asked a one-line question, and typed the answers verbatim straight into MidJourney and boom. Times are getting weird...
  24. A collection of wrong and failed results from ChatGPT
  25. Use the AWS TypeScript CDK to configure cloud infrastructure on AWS
  26. Seeing people trick ChatGPT into getting around the restrictions OpenAI placed on usage is like watching an Asimov novel come to life
  27. Never ever write a job description again
  28. ChatGPT is getting pretty close to replicating the Stack Overflow community already
  29. That's how I'll pick books in the future
  30. ChatGPT is amazing but OpenAI has not come close to addressing the problem of bias. Filters appear to be bypassed with simple tricks, and superficially masked
  31. i'm the ai now
  32. All the ways to get around ChatGPT's safeguards

2023

  1. Programming with ChatGPT. Some observations
  2. The best ways to use ChatGPT. 8 ways ChatGPT can save you thousands of hours in 2023
  3. Everyone’s using ChatGPT. Almost everyone's STUCK in beginner mode. 10 techniques to get massively ahead with AI (cut-and-paste these prompts)
  4. David Guetta uses ChatGPT and uberduck.ai to deepfake Eminem rap for DJ set

Others

Mostly found in GitHub Gist:

ChatGPT Alternatives

  • Perplexity - A new search interface that uses OpenAI GPT 3.5 and Microsoft Bing to directly answer any question you ask.
  • Bard from Google
  • Sparrow from DeepMind
  • YouChat
  • Poe from Quora
  • Bloom from BigScience
  • Character AI
  • Jasper Chat

Loosely based on the "publicly announced ChatGPT variants and competitors" Tweet.


  1. https://github.com/humanloop/awesome-chatgpt
  2. https://github.com/Kamigami55/awesome-chatgpt
  3. Key components of the GPT-3.5 models' training data are Books1 and Books2. Books1, aka BookCorpus, is a collection of free books scraped from smashwords.com. Books2: we know very little about what this is; people suspect it's libgen, but that's purely conjecture. Nonetheless, books3 is "all of bibliotik".
  4. https://github.com/saharmor/awesome-chatgpt