LLM Data Companion

This repository contains companion data sets for LLM publications.

AI-Assisted Ideation

Companion data to Randomness Is All You Need: Semantic Traversal of Problem-Solution Spaces with Large Language Models

Please cite that paper if you make use of this data, for example with the following BibTeX entry:

@article{sandholm2024,
  title={{Randomness Is All You Need: Semantic Traversal of Problem-Solution Spaces with Large Language Models}},
  author={Thomas Sandholm and Sayandev Mukherjee and Bernardo A. Huberman},
  journal={arXiv preprint arXiv:2402.06053},
  year={2024}
}

The generated data dumps are available in /aidea.

They are organized by the original problem statement that generated them:

Data	Problem Statement
timeline	Software project timelines are often underestimated, which leads to high costs.
employee	It is difficult to measure employee satisfaction in an unbiased way.
startup	It is not easy for early startups to find a customer base willing to try new technology.
data	Companies struggle with gaining insights from large volumes and high velocity of data.
satisfaction	It is hard to track and measure customer satisfaction across large geographies.
invest	It is difficult to plan investments in an uncertain economy.
innovation	It is difficult to create innovation opportunities without introducing too much process and hampering creativity.
talent	Retaining high-performing talent is hard in competitive emerging markets.
ml	Large machine learning models are expensive and time consuming to train.
privacy	Ensuring privacy of customers is difficult while leveraging their data for business insights.

Problems generated from solutions have the prefix gprob. The solution they are generated from has the prefix gsol.

The file naming convention is:

<type>.<index>.<temperature>.txt

where type can be gsol, gprob, sol or prob, denoting solutions to generated problems, generated problems, solutions and problems, respectively. The index denotes the order in which the solution was generated, starting with solution 1, which is the solution to the original problem. The ordering is determined by a depth-first search of the related problem and generated problem tree. The temperature is the LLM temperature set when generating solutions and problems from a prompt. The temperatures used include 0.5, 0.6, 0.7, 0.8, 0.9, 1.0 and 1.1. The actual temperature fed into the LLM is a uniform random number in the interval [temp, temp + 0.1].
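
The naming convention above can be unpacked programmatically. The following is a minimal sketch, not part of the repository: the helper names parse_dump_name and sample_temperature are hypothetical, and it assumes the temperature appears in the file name as a plain decimal such as 0.7 (so a name would look like gsol.3.0.7.txt). The sampling helper only illustrates the stated interval [temp, temp + 0.1]; it is not the authors' generation code.

```python
import random
import re
from typing import NamedTuple

# Assumed file name shape: <type>.<index>.<temperature>.txt, e.g. "gsol.3.0.7.txt".
# The temperature itself may contain a dot, so a regex is safer than str.split.
_NAME_RE = re.compile(r"(gsol|gprob|sol|prob)\.(\d+)\.(\d+(?:\.\d+)?)\.txt")


class DumpName(NamedTuple):
    kind: str           # gsol, gprob, sol or prob
    index: int          # generation order (depth-first), starting at 1
    temperature: float  # base temperature label from the file name


def parse_dump_name(filename: str) -> DumpName:
    """Split a dump file name into its type, index and temperature parts."""
    match = _NAME_RE.fullmatch(filename)
    if match is None:
        raise ValueError(f"not a recognized dump file name: {filename!r}")
    kind, index, temperature = match.groups()
    return DumpName(kind, int(index), float(temperature))


def sample_temperature(base: float, rng: random.Random) -> float:
    """Draw a temperature uniformly from [base, base + 0.1] (illustration only)."""
    return rng.uniform(base, base + 0.1)


if __name__ == "__main__":
    print(parse_dump_name("gsol.3.0.7.txt"))
    print(sample_temperature(0.7, random.Random(0)))
```

If the actual files encode the temperature differently, only the regular expression needs to change.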

