defmodule RAG.DataCollector do
  # Takes a list of GitHub API directory URLs and returns a flat list
  # of text chunks from every markdown file found in them.
  def fetch_and_chunk_docs(urls) do
    Enum.flat_map(urls, &process_directory/1)
  end

  defp process_directory(url) do
    # Download and chunk a single directory entry; non-markdown entries yield [].
    extract_chunks = fn file ->
      case file do
        %{"type" => "file", "name" => name, "download_url" => download_url} ->
          if String.ends_with?(name, ".md") do
            Req.get!(download_url).body
            |> TextChunker.split(format: :markdown)
            |> Enum.map(&Map.get(&1, :text))
          else
            []
          end

        _ ->
          []
      end
    end

    # The API response body is a list of maps, one per directory entry.
    Req.get!(url).body
    |> Enum.flat_map(fn file -> extract_chunks.(file) end)
  end
end

guides = [
  "https://api.github.com/repos/phoenixframework/phoenix_live_view/contents/guides/server",
  "https://api.github.com/repos/phoenixframework/phoenix_live_view/contents/guides/client"
]
chunks = RAG.DataCollector.fetch_and_chunk_docs(guides)

["# Assigns and HEEx templates\n\nAll of the data in a LiveView is stored in the socket, which is a server \nside struct called `Phoenix.LiveView.Socket`. Your own data is stored\nunder the `assigns` key of said struct. The server data is never shared\nwith the client beyond what your template renders.\n\nPhoenix template language is called HEEx (HTML+EEx). EEx is Embedded \nElixir, an Elixir string template engine. Those templates\nare either files with the `.heex` extension or they are created\ndirectly in source files via the `~H` sigil. You can learn more about\nthe HEEx syntax by checking the docs for [the `~H` sigil](`Phoenix.Component.sigil_H/2`).\n\nThe `Phoenix.Component.assign/2` and `Phoenix.Component.assign/3`\nfunctions help store those values. Those values can be accessed\nin the LiveView as `socket.assigns.name` but they are accessed\ninside HEEx templates as `@name`.\n\nIn this section, we are going to cover how LiveView minimizes\nthe payload over the wire by understanding the interplay between\nassigns and templates.\n",
"\n## Change tracking\n\nWhen you first render a `.heex` template, it will send all of the\nstatic and dynamic parts of the template to the client. Imagine the\nfollowing template:\n\n```heex\n<h1><%= expand_title(@title) %></h1>\n```\n\nIt has two static parts, `<h1>` and `</h1>` and one dynamic part\nmade of `expand_title(@title)`. Further rendering of this template\nwon't resend the static parts and it will only resend the dynamic\npart if it changes.\n\nThe tracking of changes is done via assigns. If the `@title` assign\nchanges, then LiveView will execute the dynamic parts of the template,\n`expand_title(@title)`, and send\nthe new content. If `@title` is the same, nothing is executed and\nnothing is sent.\n\nChange tracking also works when accessing map/struct fields.\nTake this template:\n\n```heex\n<div id={\"user_\#{@user.id}\"}>\n <%= @user.name %>\n</div>\n```\n\nIf the `@user.name` changes but `@user.id` doesn't, then LiveView\nwill re-render only `@user.name` and it will not execute or resend `@user.id`\nat all.\n\nThe change tracking also works when rendering other templates as\nlong as they are also `.heex` templates:\n\n```heex\n<%= render \"child_template.html\", assigns %>\n```\n\nOr when using function components:\n\n```heex\n<.show_name name={@user.name} />\n```\n\nThe assign tracking feature also implies that you MUST avoid performing\ndirect operations in the template. For example, if you perform a database\nquery in your template:\n\n```heex\n<%= for user <- Repo.all(User) do %>\n <%= user.name %>\n<% end %>\n```\n\nThen Phoenix will never re-render the section above, even if the number of\nusers in the database changes. Instead, you need to store the users as\nassigns in your LiveView before it renders the template:\n\n assign(socket, :users, Repo.all(User))\n\nGenerally speaking, **data loading should never happen inside the template**,\nregardless if you are using LiveView or not. 
The difference is that LiveView\nenforces this best practice.\n",
"\n## Pitfalls\n\nThere are some common pitfalls to keep in mind when using the `~H` sigil\nor `.heex` templates inside LiveViews.\n\n### Variables\n\nDue to the scope of variables, LiveView has to disable change tracking\nwhenever variables are used in the template, with the exception of\nvariables introduced by Elixir block constructs such as `case`,\n`for`, `if`, and others. Therefore, you **must avoid** code like\nthis in your HEEx templates:\n\n```heex\n<% some_var = @x + @y %>\n<%= some_var %>\n```\n\nInstead, use a function:\n\n```heex\n<%= sum(@x, @y) %>\n```\n\nSimilarly, **do not** define variables at the top of your `render` function\nfor LiveViews or LiveComponents. Since LiveView cannot track `sum` or `title`,\nif either value changes, both must be re-rendered by LiveView.\n\n def render(assigns) do\n sum = assigns.x + assigns.y\n title = assigns.title\n\n ~H\"\"\"\n <h1><%= title %></h1>\n\n <%= sum %>\n \"\"\"\n end\n\nInstead use the `assign/2`, `assign/3`, `assign_new/3`, and `update/3`\nfunctions to compute it. Any assign defined or updated this way will be marked as\nchanged, while other assigns like `@title` will still be tracked by LiveView.\n\n assign(assigns, sum: assigns.x + assigns.y)\n\nThe same functions can be used inside function components too:\n\n attr :x, :integer, required: true\n attr :y, :integer, required: true\n attr :title, :string, required: true\n def sum_component(assigns) do\n assigns = assign(assigns, sum: assigns.x + assigns.y)\n\n ~H\"\"\"\n <h1><%= @title %></h1>\n\n <%= @sum %>\n \"\"\"\n end\n\nGenerally speaking, avoid accessing variables inside `HEEx` templates, as code that\naccess variables is always executed on every render. The exception are variables\nintroduced by Elixir's block constructs. For example, accessing the `post` variable\ndefined by the comprehension below works as expected:\n\n```heex\n<%= for post <- @posts do %>\n ...\n<% end %>\n```\n",
"\n### The `assigns` variable\n\nWhen talking about variables, it is also worth discussing the `assigns`\nspecial variable. Every time you use the `~H` sigil, you must define an\n`assigns` variable, which is also available on every `.heex` template.\nHowever, we must avoid accessing this variable directly inside templates\nand instead use `@` for accessing specific keys. This also applies to\nfunction components. Let's see some examples.\n\nSometimes you might want to pass all assigns from one function component to\nanother. For example, imagine you have a complex `card` component with \nheader, content and footer section. You might refactor your component\ninto three smaller components internally:\n\n```elixir\ndef card(assigns) do\n ~H\"\"\"\n <div class=\"card\">\n <.card_header {assigns} />\n <.card_body {assigns} />\n <.card_footer {assigns} />\n </div>\n \"\"\"\nend\n\ndefp card_header(assigns) do\n ...\nend\n\ndefp card_body(assigns) do\n ...\nend\n\ndefp card_footer(assigns) do\n ...\nend\n```\n\nBecause of the way function components handle attributes, the above code will\nnot perform change tracking and it will always re-render all three components\non every change.\n\nGenerally, you should avoid passing all assigns and instead be explicit about\nwhich assigns the child components need:\n\n```elixir\ndef card(assigns) do\n ~H\"\"\"\n <div class=\"card\">\n <.card_header title={@title} class={@title_class} />\n <.card_body>\n <%= render_slot(@inner_block) %>\n </.card_body>\n <.card_footer on_close={@on_close} />\n </div>\n \"\"\"\nend\n```\n\nIf you really need to pass all assigns you should instead use the regular\nfunction call syntax. This is the only case where accessing `assigns` inside\ntemplates is acceptable:\n\n```elixir\ndef card(assigns) do\n ~H\"\"\"\n <div class=\"card\">\n <%= card_header(assigns) %>\n <%= card_body(assigns) %>\n <%= card_footer(assigns) %>\n </div>\n \"\"\"\nend\n",
"```\n\nThi |
We consider two types of documents in the Phoenix_LiveView repo: markdown files and Elixir modules (whose `moduledoc` is of particular interest). GitHub serves raw file contents from the endpoint "https://raw.githubusercontent.com/...". We can also get the list of files from the GitHub API at the endpoint "https://api.github.com/repos/....".
To chunk, I tested the package TextChunker. It divides the text into smaller chunks in a hierarchical and iterative manner, using a set of separators.
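As a rough illustration of what "hierarchical and iterative" splitting with a set of separators can look like, here is a minimal sketch in plain Elixir. It is not TextChunker's implementation: the module name `ChunkSketch`, the separator list, and the greedy packing rule are all assumptions made for this example.

```elixir
defmodule ChunkSketch do
  # Hypothetical separator hierarchy, coarsest first
  # (an assumption for this sketch, not TextChunker's actual list).
  @separators ["\n## ", "\n\n", "\n", " "]

  # Returns a list of chunks, each at most `max` codepoints long.
  def split(text, max, seps \\ @separators) do
    cond do
      size(text) <= max ->
        [text]

      seps == [] ->
        hard_split(text, max)

      true ->
        [sep | finer] = seps

        text
        |> String.split(sep)
        |> pack(sep, max)
        # Pieces that are still too long are retried with finer separators.
        |> Enum.flat_map(&split(&1, max, finer))
        |> Enum.reject(&(&1 == ""))
    end
  end

  # Greedily rejoin neighbouring parts while the result still fits in `max`.
  defp pack(parts, sep, max) do
    parts
    |> Enum.reduce([], fn
      part, [] ->
        [part]

      part, [current | done] ->
        joined = current <> sep <> part
        if size(joined) <= max, do: [joined | done], else: [part, current | done]
    end)
    |> Enum.reverse()
  end

  # Last resort when no separator is left: cut on codepoint boundaries.
  defp hard_split(text, max) do
    text
    |> String.codepoints()
    |> Enum.chunk_every(max)
    |> Enum.map(&Enum.join/1)
  end

  defp size(text), do: length(String.codepoints(text))
end
```

Calling `ChunkSketch.split(markdown, 2000)` tries headings first, then paragraphs, lines and words, and only cuts mid-word when nothing else fits. One simplification to keep in mind: the separator at a chunk boundary is dropped, so a chunk that starts at a heading loses its leading `## `.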
I used `[format: :markdown]` for ".md" documents and no format option for the ".html" documents. Chunk sizes are not "small": they range from 600 to 2000 codepoints.
The result is: