Generative AI Examples v1.0 Release Notes

@kevinintel kevinintel released this 20 Sep 09:36
· 291 commits to main since this release

OPEA Release Notes v1.0

What’s New in OPEA v1.0

  • Highlights

    • Improve RAG performance through microservice optimizations (e.g., Hugging Face TGI, vLLM) and megaservice tuning
    • Provide experimental LLM training support, including full fine-tuning and parameter-efficient fine-tuning (PEFT)
    • Improve RAG with Knowledge Graph based on Neo4j
    • Improve VisualQnA and provide multi-modality RAG support
    • Faster microservice launch through removal of some dispatch overhead
    • Enable Gateway with guardrail, and integrate nginx with CORS protection and data preparation
    • Enable HorizontalPodAutoscaler (HPA) for better resource management
    • Define the metrics of RAG performance and enable accuracy evaluation for more GenAI examples
    • Further improve documentation and developer experience
  • Other features

    • Enable OpenAI compatible format on applicable microservices
    • Support microservice launch from ModelScope to address the needs of the China ecosystem
    • Support Red Hat OpenShift Container Platform (RHOCP)
    • Refactor the code and CI/CD pipeline to provide better support for contributors
    • Improve Docker versioning to avoid potential conflicts
    • Enhance GenAI Microservice Connector (GMC), including improvements such as router performance optimizations and other updates
    • Introduce Memory Bandwidth Exporter that integrates with Kubernetes Node Resource Interface
  • Learn more about OPEA at

  • Release Documentation:

Details

GenAIExamples
  • Deployment

    • Add ui/nginx support in K8S manifest for ChatQnA/CodeGen/CodeTrans/Docsum(ba94e01)
    • K8S manifest: Update ChatQnA/CodeGen/CodeTrans/DocSum(0629696)
    • Update mount path in xeon k8s(2a6af64)
    • Add Nginx - k8s manifest in CodeTrans(6a679ba)
    • Add Nginx - docker in CodeTrans(cc84847)
    • watch more docker compose files changes(4b0bc26)
    • Add chatQnA UI manifest(758d236)
    • Revert the LLM model for kubernetes GMS(f5f1e32)
    • [ChatQnA] Update retrieval & dataprep manifests(6730b24)
    • [ChatQnA]Update manifests(3563f5d)
    • [ChatQnA] Update benchmarking manifests(36fb9a9)
    • [ChatQnA] Update OOB & Tuned manifests(ac34860)
    • Add nginx and UI to the ChatQnA manifest(05f9828)
    • [ChatQnA] Update OOB with wrapper manifests.(933c3d3)
    • [Translation] Support manifests and nginx(1e13031)
    • update V1.0 benchmark manifest (e5affb9)
    • update image name(e2a74f7)
    • Change megaservice path in line with new file structure(5ab27b6)
    • Yaml: add comments to specify gaudi device ids.(63406dc)
    • add tgi bf16 setup on CPU k8s.(ba17031)
  • Documentation

    • [ChatQnA] Update README for ModelScope(aebc23f)
    • Update README.md(4bd7841)
    • [ChatQnA] Update README for without Rerank Pipeline(6b617d6)
    • [ChatQnA] Update Benchmark README for w/o rerank(4a51874)
    • Fix readme for nv gpu(43b2ae5)
    • [ChatQnA] Update Benchmark README to Fix Input Length(55d287d)
    • Refine ChatQnA README for TGI(afc3341)
    • Add default model for VisualQnA README(07baa8f)
    • Update readme for manifests of some examples(adb157f)
    • doc: use markdown table in supported_examples(9cf1d88)
    • doc: remove invalid code block language(c6d811a)
    • add AudioQnA readme with supported model(f4f4da2)
    • add more code owners(7f89797)
    • doc: fix headings(7a0fca7)
    • [Codegen] Refine readme to prompt users on how to change the model.(814164d)
    • Update README.md and remove some open-source details(2ef83fc)
    • Add issue template(84a781a)
    • doc: fix headings and indenting(67394b8)
    • Add default model in readme for FaqGen and DocSum(d487093)
    • Change docs of kubernetes for curl commands in README(4133757)
    • Update v0.9 RAG release data(947936e)
    • Explain Default Model in ChatQnA and CodeTrans READMEs(2a2ff45)
    • Update docker images list.(a8244c4)
    • refactor the network port setting for AWS(bc81770)
    • Add validate microservice details link(bd811bd)
    • [ChatQnA] Add Nginx in Docker Compose and README(6c36448)
    • [Doc] Update CodeGen and Translation READMEs(a09395e)
    • [Doc] Refine READMEs(372d78c)
    • Remove marketing materials(d85ec09)
    • doc PR to main instead of v1.0r(dc94026)
    • Update README.md for Multiplatforms(b205dc7)
    • Refine the quick start of ChatQnA(3b70fb0)
    • Update supported_examples(96d5cd9)
    • [Doc] doc improvement(e0b3b57)
    • Fix README issues(bceacdc)
    • doc: fix broken image reference and markdown(d422929)
    • doc: give document meaningful title(a3fa0d6)
    • doc: fix incorrect path to png image files (d97882e)
    • update doc according to comments(f990f79)
    • refine readme for reorg(d2bab99)
    • Update README with new examples(2d28beb)
    • README: fix broken links(ff6f841)
    • Update README.md of pdf file(87e51d5)
    • Add table to list port, endpoint, framework, model, serving, and hardware for each microservice in ChatQnA(1a934af)
    • Update SearchQnA document and compose.yaml(5c67204)
    • Update invalid link(7b2194f)
    • AgentQnA: Fix erroneous link in the README(1144fae)
    • Fix Xeon reference per its trademark(e1b8ce0)
    • Provide the method to get nke-10k-2023.pdf(a2745b2)
    • adopted tech writing style(558ea3b)
    • Improve ChatQnA flowchart according to feedback(375ea7a)
    • Fix BACKEND_SERVICE_ENDPOINT variable value in the VideoQnA instructions(79e947e)
    • [Doc] Refine ChatQnA README(7eaab93)
  • Functionalities and Bug Fixes

    • Fix refactor bug(7c13f2c)
    • Provide the method to get nke-10k-2023.pdf(a2745b2)
    • Integrate visualQnA backend(fa12083)
    • Enable nginx for VisualQnA(def19b4)
    • Add Settings and Update system Prompt option(1d1e1f9)
    • Refactor folder to support different vendors(d73129c)
    • Add rerank finetuning example(71857f5)
    • remove logs for benchmark(e0bc5f2)
    • update image build for 2 new examples(0869029)
    • fix comps/nginx image build content(22d066a)
    • react-ui: Add support to display Chinese(8c40204)
    • [VisualQnA] Update compose.yaml to fix the endpoint url issue in UI(fbaa024)
    • Add megaservice definition without microservice wrappers(ebe6b47)
    • Add instruction tuning example(4c78f8c)
    • fix token name(1e47444)
    • Modify the handling of detected warnings to only prompt.(e6f5d13)
    • Always upload scan artifacts(6f3e54a)
    • Update ChatQnA env (32afb65)
    • Yinghu5 patch 1(beda609)
    • Update ollama run command(10c81f1)
    • weekly update images tag(035f39f)
    • Fix port conflict in llava-tgi-service in VisualQnA(993688a)
    • Remove 'vim' from all Dockerfiles(1874dfd)
    • enhance image publish action(5fde666)
    • Update port in set_env.sh for TGI endpoint(e5ec38c)
    • move evaluation scripts(f04f061)
    • Handle uncontrolled data path for MultimodalQnA v1.0 release(872e93e)
    • Align parameters for "max_token, repetition_penalty,presence_penalty,frequency_penalty"(2f03a3a)
    • Remove useless folder.(88829c9)
    • fix path bug for reorg(264759d)
    • fix reorg bug(504228e)
    • Add hyperlinks picture paths validation.(0611707)
    • Added gaudi example for rerank model finetuning(edcc50f)
    • Add VideoRAGQnA as MMRAG usecase in Example(2dd69dc)
    • Agent example for v1.0 release(262a6f6)
    • Fix issues with the VisualQnA instructions (bc4bbfa)
    • Made codegen react ui to use runtime environment variables(b84c989)
    • add image build for new examples(3f2e7b7)
    • fix image build issue on push(88fde62)
    • [ChatQnA] Add no_wrapper benchmarking and update legacy manifests(06696c8)
    • Add imagePrompt to display default image hint(e48532e)
    • BUGFIX: rename videoragqna to videoqna to align with other examples(e102291)
    • Fix megaservice ulimit issue under high concurrency(4112fd0)
  • CI/CD/UT

    • Add new test cases for VisualQnA(995a62c)
    • docker image cd workflow enhance (675ea4a)
    • optimize image scan cd workflow(dba908a)
    • Refine code scan output and remove opea_release_data.md.(21e215c)
    • Fix other repo issue.(412a0b0)
    • [DocIndexRetriever] Add xeon test and fix gaudi test (62dbb6d)
    • watch more docker compose files' changes(4b0bc26)
    • fix typo in test script in AgentQnA(10fe3c6)
    • Fix InstructionTuning and RerankFinetuning tests(be8e283)
    • Fix issue(0bb0abb)
    • print image build test commit(3ce3955)
    • Fix SearchQnA tests bug(daf2a4f)
    • [ProductivitySuite] Fix CD Issue(d55a33d)
GenAIComps
  • Cores

    • Optimize mega flow by removing microservice wrapper(0bb69ac)
    • Fix guardrails out handle logics for space linebreak and quote(e38ed6d)
    • fix mismatched response format w/wo streaming guardrails(b6c0785)
  • Fine-tuning/Pre-training

    • Added finetuned model deployment tutorial in readme(2931147)
    • Add LLM pretraining support(58e9972)
    • updates to containers for finetuning composite(f4d123c)
    • enable embedding finetuning(7e1a2e5)
    • update finetuning doc(7d2cd6b)
    • Support rerank model finetuning(7d9265f)
    • finetuning models limitation.(a924579)
    • Update checkpoint format(8369fbf)
    • update upload_training_files format(3367b76)
    • refine logging code.(5b3053f)
  • LVM/Video RAG

    • Fix lvms video-llama code issue(38abaab)
    • Fix LVM streaming issue(fb4b8d2)
    • Add schema to Redis initialization & Improve LVM-TGI For Multimodal Retriever Microservice(23cc3ea)
    • Retriever and lvm update for multimodal rag on videos(1513998)
    • BUG FIX: LVM security fix(3e548f3)
    • Add Megaservice support for MMRAG VideoRAGQnA usecase(2c48bc8)
    • Add local Rerank microservice for VideoRAGQnA(5fb4a38)
    • Add Megaservice support for MMRAG - MultimodalRAGQnAWithVideos usecase(99be1bd)
    • Bugfix for PR 496 to add format_video_name function(54aa943)
    • Prediction Guard LVM component(1249c4f)
    • Fix vLLM components images building(161c338)
  • LLM/Rerank/Retrieval

    • fix vllm llamaindex stream bug(ca94c60)
    • Support Llama index for llms native(2e41dcf)
    • Prediction Guard LLM component(391c4a5)
    • update vllm to latest version for hpu(599a58f)
    • Align parameters for "max_token, repetition_penalty,presence_penalty,frequency_penalty"(3a31295)
    • optimize rerank with backend ref(d76751a)
    • add VDMS retriever microservice for v0.9 Milestone(445c9b1)
    • Fix the Retriever README error(1d761fa)
    • unify default reranking model with BAAI/bge-reranker-base(48d4e53)
    • Fix Ollama langchain upgrade issue(8adbcce)
    • vllm langchain: Add Document Retriever Support(0f2c2b1)
    • Support Llama index for vLLM(8e3f553)
    • Changes to comps/llms/text-generation/README(18092f3)
    • Fix security problem(a672569)
  • DataPrep/vector stores

    • Fix the loading error of jsonl file(2fbce3e)
    • To avoid port conflicts change port to others.(89197e5)
    • Dataprep fetch page fix(01886fe)
    • Multimodal dataprep(6d4b668)
    • Refine Dataprep Milvus MS(7686cfa)
    • dataprep: Fix issue in uploading docx with embedding image(b873cf8)
    • add: Pathway vector store and retriever as LangChain component(2c2322e)
    • adding lancedb to langchain vectorstores(2360e5a)
    • adding dataprep support for CLIP based models for VideoRAGQnA example for v1.0(f84d91a)
  • Other Components

    • Fix intent detection code issue(4c0f527)
    • clear some unnecessary scripts and Dockerfile commands.(824a7e2)
    • Update CODEOWNERS(5537b7f)
    • doc: fix heading levels in markdown content(a8a46bc)
    • [Reorg] Reorg Folder to Support Different Vendors(bea9bb0)
    • unify default reranking model with BAAI/bge-reranker-base(48d4e53)
    • feedback_management: Remove 'vim' from Dockerfile(b2e64d2)
    • switch to using upstream 'tgi-gaudi' on HuggingFace(90cc44f)
    • Using Pip '--no-cache-dir' within all Dockerfiles(f1f866f)
    • Change image tag.(2093558)
    • add code owners(0379aeb)
    • Remove revision for TEI Embedding(d609071)
    • BUGFIX: fix SearchedMultimodalDoc in docarray(ed44b44)
    • Feedback management microservice component(72123b2)
    • bump version into v1.0(9a1af76)
    • Add Scan Container.(0d49244)
    • Remove 'vim' from all Dockerfiles(25174c0)
    • update image build yaml(b541fd8)
    • ollama: Update curl proxy.(f510b69)
    • Embedding Runtime on NeuralSpeed(0292355)
    • add microservice for intent detection(84a7e57)
    • Update README.md for Multiplatforms(ef90fbb)
    • doc: fix heading levels(f8f8854)
    • Prediction Guard embeddings component(191061b)
    • [ChatQnA] Support K8S Python Client to export ChatQnA E2E manifests(af4e0f8)
    • Add Megaservice support for MMRAG VideoRAGQnA usecase(2c48bc8)
    • replace langchain/langchain:latest with python:3.11-slim(6ce6551)
    • Support for UI of MultimodalRAGWithVideos in GenAIExamples(7664578)
    • Remove fixed version in requirements.txt(f416f84)
    • Update README.md for broken/missing readme(00227b8)
    • adding embedding support for CLIP based models for VideoRAGQnA example for v0.9(2a53e25)
    • same PR as #694 but on main branch(4b5d85b)
    • doc: Fix headings(f6ae4fa)
    • Fix all the microservices affected by langchain version upgrade(04385c9)
    • update version freeze for requirements-runtime.txt(1e4c382)
    • add contributing section to main readme(2ba3516)
    • Update embedding svc test port number(574fecf)
    • Enable GraphRAG with Neo4J(29fe569)
    • Refine READMEs after reorg(7e40475)
    • Support export megaservice yaml to docker compose file(cff0a4d)
    • Rename videoragqna to videoqna to align with other examples(2b68323)
    • Update example name into MultimodalQnA and update image names(2ca56f3)
    • Fix Reorg Issues(a3da7c1)
    • Move neuralspeed embedding rerank and vllm-xft to catalog(98c62a0)
    • fix ragagent text generator bug(42cde68)
    • Add Bias Detection Microservice(812c85c)
    • Update README.md of Table in markdown(849cac9)
    • update dependency version(4eee716)
  • CI/CD/UT

    • add PREDICTIONGUARD_API_KEY for CI(94eb60f)
    • update CI test log archive(960f66c)
    • expand CI timeout(6c24078)
    • image scan and publish cd enhance(341f97a)
    • add resume finetuning checkpoint ut.(c718602)
    • Bug_fix.(2a91903)
    • Optimize the content of the alerts.(8a11413)
    • Add compose file.(7a21d09)
    • Remove duplicate code(8325d5d)
    • Fix image build fail issue.(3ce387a)
    • Bug fix(12fd97a)
    • enhance image publish job(9007212)
    • Dockerfile check(2705e93)
    • Make the scanning method optional.(ae71eee)
    • Modify output messages.(3e87c3b)
    • minor fix for CI detect(1785149)
    • Add OpenAI client access OPEA microservice UT cases(1b69897)
    • optimize ci test scope(4165c7d)
    • Fixed CI yaml(3ac391a)
    • Move finetuning test script path(267fb02)
    • Add E2E test for bias detection of guardrails(e29865e)
    • Add hyperlinks and paths validation.(ccdd2d0)
    • Update manual test.(2794abd)
    • Opt filecheck(61b8fa9)
    • update ci action(b4a7f26)
    • update image build compose(3d00a33)
    • Adding Bias Detection Container to CI(6617e22)
    • update cd workflow(3c5fc80)
    • update torch cpu installation(0458443)
    • Fix error.(887ca75)
    • temp remove dockerfile check(2d5130f)
GenAIEvals
  • Accuracy

    • add audioqna asr wer eval scripts(cf8bd83)
    • update llm-as-judge doc.(102fcdd)
    • [v1.0] Add docker metric support(cff0a36)
    • fix issue because of ragas changes(6abbe40)
    • Add README for codegen acc test.(77bb66c)
    • Update chatqna input to fix input length(4f46a12)
    • Support bigcode eval for codegen v0.1(02b60b5)
    • Add FaqGen Accuracy scripts & Refine Ragas(4df6438)
    • update rag_eval readme(425b423)
    • fix bigcode version when python>=3.11(1d3a502)
    • add acc tuning script.(a6fd418)
  • Performance

    • [ChatQnA] Support the replica tuning for ChatQnA(484b69a)
    • Fix rerank benchmark script(8edda1c)
    • Support service-list for metrics collection in benchmark.py(58502c5)
    • Support benchmark file for w/o rerank pipeline(17d35e3)
    • Update configuration in benchmark README(514a6d6)
    • Support P50, P90, P99 for next token latency(6ac555c)
    • Support microservice level benchmark(626d269)
    • Support stresscli for codegen(907dc19)
    • Align llm microservice parameters with end to end test(476a327)
    • Fix microservice level benchmark issue(211b560)
    • Add benchmark part into top README(ac52f79)
    • Add CRAG benchmark(a9b087f)
    • add bench-target as the prefix of output folder(3f0ceaf)
  • Others

    • doc: fix headings and indents(65a0a5b)
    • doc: add title to new FaqGen README(52a540d)
    • add code owners(047c479)
    • doc: fix heading level(d5dbbf0)
    • doc: fix JSON example(7318fb8)
    • Update CODEOWNERS(4db9fb3)
    • doc: update platform optimization document(d982681)
    • remove examples.(340f507)
    • Add hyperlinks and paths validation(df58fe5)
    • Remove useless file(0af532a)
GenAIInfra
  • GMC

    • GMC: Add a CR for switch mode on one NV GPU card(02412e7)
    • Update the GMC README based on current changes.(6f7a24e)
    • fix GMC crashes in e2e (5a2b306)
    • Add unit test for new function in GMC router(0343a2f)
    • GMC: add UT for reconcile filters(6442127)
    • Enable gmc build workflow on push(19fe1a2)
    • Doc: Fix some typos to run GMC more smoothly(59000c5)
    • Improve the performance of GMC router(68a2011)
    • GMC: enhance log(a18404e)
  • HelmChart

    • e2e helm chart: Add ui for codegen/codetrans/docsum(267d828)
    • helm: Add guardrails llama_guard support(8206a8c)
    • Enable guardrail case in helm e2e tests(491c2e2)
    • helm chart: add nginx to avoid CORS issue(353f3a5)
    • helm-chart/common: Add logging config for service components(b80ae50)
    • helm-chart/data-prep: Add the missing config for dataprep-redis(b70b914)
    • helm: use latest image tag on main branch(65b04dc)
    • helm/manifest: Update to release v0.9(182183e)
    • Add topologySpreadConstraints support(af9e1b6)
    • Add TGI additional options(bf10bdd)
    • Add vLLM inference engine support(0094f52)
    • Remove unused values and change GenAIExamples default(26f9b16)
    • 'ghcr.io/huggingface/text-generation-inference:sha-e4201f4-intel-cpu' is intel cpu(c84ac4c)
  • Documentation

    • add code owner(59ce505)
    • doc: fix headings and indenting(c10bca1)
    • doc: fix headings, spelling, inter-doc references(22d012e)
    • doc: fix image references(0a3e006)
    • Add docs for all 3 use cases of ChatQnA examples and change models for switch case(987870f)
    • doc: restructure authN-authZ directory(b9bc034)
    • Update README(9480afc)
    • doc: fix markdown issues(a339a87)
    • Doc: Fix broken links(032ddbc)
    • Enhance helm chart repo usage in README(0de5535)
    • Create troubleshooting.md(d55ded4)
  • Others

    • Fix CI bug #417(56d7d5d)
    • disable hpa-values test in chart e2e in CI(9b38302)
    • Add unit test for memory bandwidth exporter.(43adcc6)
    • Enable unit test for memory-bandwidth-exporter in CI(923c1f3)
    • add Observability for OPEA(8d304ac)
    • fix a badcommit in #383(406bbc2)
    • Add dataprep CR for NV platform(fa9788d)
    • Add memory bandwidth exporter for AI workload.(9107af9)
    • authN-authZ: update configs(0f5cef1)
    • E2E: exclude terminating pods when wait_util_all_pod_ready(39fb55e)
    • Add gateway guardrails(b22fc52)
    • fix #314(f9204f0)
    • v0.9 charts release(b2328b8)
    • Restructure the directory of config sample and update the e2e test(326a637)
    • Enhance ut(96cd929)
    • improve cd workflows and add release document(a4398b0)
    • Add HPA support to ChatQnA(cab7a88)
    • Add some NVIDIA platform support docs and scripts(cad2fc3)
    • Expose options of memory bandwidth exporter in k8s manifests and docker for user configuration(2517e79)
    • Update the image version for ChatQnA examples(593458c)
    • Update top level README(b224b65)
    • Enable OIDC based Authentication with apisix(ee907d6)
    • HPA improvements(8d86fff)
    • authn-authz: fix CORS issue and refine doc(994250c)
    • Add hyperlinks and paths validation(d8cd3a1)