Download video(s) (first iteration) (#5)
* Updated package.json (and made an excuse to make a branch)

* Video filepath parser (#6)

* Restructured files; Added parser placeholder

* More restructuring

* Added basic parser for hydrating template strings

* Improved docs

* More docs

* Initial implementation of media profiles (#7)

* [WIP] Added basic video download method

* [WIP] Very-WIP first steps at parsing options and downloading

* Made my options safe by default and removed special safe versions

* Ran html generator for mediaprofile model - leaving as-is for now

* Addressed a bunch of TODO comments

* Add "channel" type Media Source (#8)

* [WIP] Working on fetching channel metadata in yt-dlp backend

* Finished first draft of methods to do with querying channels

* Renamed CommandRunnerMock to have a more descriptive name

* Ran the phx generator for the channel model

* Renamed Downloader namespace to MediaClient

* [WIP] saving before attempting LiveView

* LiveView did not work out but here's a working controller how about

* Index a channel (#9)

* Ran a MediaItem generator; Reformatted to my liking

* [WIP] added basic index function

* setup oban

* Added basic Oban job for indexing

* Added in workers for indexing; hooked them into record creation flow

* Added a task model with a phx generator

* Tied together tasks with jobs and channels

* Download indexed videos (#10)

* Clarified documentation

* more comments

* [WIP] hooked up basic video downloading; starting work on metadata

* Added metadata model and parsing

Adding the metadata model made me realize that, in many cases, yt-dlp
writes unwanted output to stdout, which breaks parsing. To get the
metadata model working, I had to change how the app interacts with
yt-dlp: output is now written to a file on disk, which is immediately
re-read and returned.

* Added tests for video download worker

* Hooked up video downloading to the channel indexing pipeline

* Adds tasks for media items

* Updated video metadata parser to extract the title

* Ran linting
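
The note above about writing output to disk and re-reading it can be sketched roughly as follows. This is only an illustration under stated assumptions, not the commit's actual `CommandRunner`: the module name `FileOutputSketch`, the function `run_to_file/2`, and the convention that the command receives the output path as its last argument are all hypothetical, and the example shells out to `sh` rather than yt-dlp.

```elixir
defmodule FileOutputSketch do
  @moduledoc """
  Hypothetical sketch: run a command that writes its real result to a file,
  so that any noise printed on stdout cannot break parsing.
  """

  # Assumption: the command accepts the output path as its final argument.
  def run_to_file(command, args) do
    output_path =
      Path.join(System.tmp_dir!(), "sketch-#{System.unique_integer([:positive])}.out")

    case System.cmd(command, args ++ [output_path], stderr_to_stdout: true) do
      {_noise, 0} ->
        # Ignore stdout entirely; read the file the command wrote, then clean up
        result = File.read(output_path)
        File.rm(output_path)
        result

      {noise, status} ->
        # Mirror the {:error, output, exit_code} shape used by the runner behaviour
        {:error, noise, status}
    end
  end
end

# With `sh -c SCRIPT PATH`, the path is available to the script as "$0"
script = ~s(echo "WARNING: noise on stdout"; printf '{"title":"x"}' > "$0")
IO.inspect(FileOutputSketch.run_to_file("sh", ["-c", script]))
```

The point of the design is that stdout is never parsed, so warnings or progress lines printed there do not matter; only the file contents are returned to the caller.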
kieraneglin authored Jan 31, 2024
1 parent fee3915 commit a5e7c48
Showing 96 changed files with 9,358 additions and 158 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -38,3 +38,4 @@ npm-debug.log

/.elixir_ls
.env
.DS_Store
2 changes: 2 additions & 0 deletions Dockerfile
@@ -29,5 +29,7 @@ RUN chmod +x ./docker-run.sh

# Install Elixir deps
RUN mix deps.get
# Gives us iex shell history
ENV ERL_AFLAGS="-kernel shell_history enabled"

EXPOSE 4008
4 changes: 4 additions & 0 deletions assets/yarn.lock
@@ -0,0 +1,4 @@
# THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.
# yarn lockfile v1


12 changes: 11 additions & 1 deletion config/config.exs
@@ -12,7 +12,10 @@ config :pinchflat,
   generators: [timestamp_type: :utc_datetime],
   # Specifying backend data here makes mocking and local testing SUPER easy
   yt_dlp_executable: System.find_executable("yt-dlp"),
-  yt_dlp_runner: Pinchflat.DownloaderBackends.YtDlp.CommandRunner
+  yt_dlp_runner: Pinchflat.MediaClient.Backends.YtDlp.CommandRunner,
+  # TODO: figure this out
+  media_directory: :not_implemented,
+  metadata_directory: Path.join([System.tmp_dir!(), "pinchflat", "metadata"])

 # Configures the endpoint
 config :pinchflat, PinchflatWeb.Endpoint,
@@ -25,6 +28,13 @@ config :pinchflat, PinchflatWeb.Endpoint,
   pubsub_server: Pinchflat.PubSub,
   live_view: [signing_salt: "/t5878kO"]

+config :pinchflat, Oban,
+  repo: Pinchflat.Repo,
+  # Keep old jobs for 30 days for display in the UI
+  plugins: [{Oban.Plugins.Pruner, max_age: 30 * 24 * 60 * 60}],
+  # TODO: consider making this an env var or something?
+  queues: [default: 10, media_indexing: 2, media_fetching: 2]
+
 # Configures the mailer
 #
 # By default it uses the "Local" adapter which stores the emails
4 changes: 4 additions & 0 deletions config/dev.exs
@@ -1,5 +1,9 @@
import Config

config :pinchflat,
  media_directory: Path.join([File.cwd!(), "tmp", "videos"]),
  metadata_directory: Path.join([File.cwd!(), "tmp", "metadata"])

# Configure your database
config :pinchflat, Pinchflat.Repo,
  username: System.get_env("POSTGRES_USER"),
6 changes: 5 additions & 1 deletion config/test.exs
@@ -2,7 +2,11 @@ import Config

 config :pinchflat,
   # Specifying backend data here makes mocking and local testing SUPER easy
-  yt_dlp_executable: Path.join([File.cwd!(), "/test/support/scripts/yt-dlp-mocks/repeater.sh"])
+  yt_dlp_executable: Path.join([File.cwd!(), "/test/support/scripts/yt-dlp-mocks/repeater.sh"]),
+  media_directory: Path.join([System.tmp_dir!(), "videos"]),
+  metadata_directory: Path.join([System.tmp_dir!(), "metadata"])
+
+config :pinchflat, Oban, testing: :manual

 # Configure your database
 #
3 changes: 3 additions & 0 deletions ideas.md
@@ -0,0 +1,3 @@
- Write media database ID as metadata/to file/whatever so it gives us an option to retroactively match media to the DB down the line. Useful if someone moves the media without informing the UI
- Use a UUID for the media database ID (or at least alongside it)
- Look into this and its recommended plugins https://hexdocs.pm/ex_check/readme.html
1 change: 1 addition & 0 deletions lib/pinchflat/application.ex
@@ -10,6 +10,7 @@ defmodule Pinchflat.Application do
    children = [
      PinchflatWeb.Telemetry,
      Pinchflat.Repo,
      {Oban, Application.fetch_env!(:pinchflat, Oban)},
      {DNSCluster, query: Application.get_env(:pinchflat, :dns_cluster_query) || :ignore},
      {Phoenix.PubSub, name: Pinchflat.PubSub},
      # Start the Finch HTTP client for sending emails
7 changes: 0 additions & 7 deletions lib/pinchflat/downloader_backends/backend_command_runner.ex

This file was deleted.

58 changes: 0 additions & 58 deletions lib/pinchflat/downloader_backends/yt_dlp/command_runner.ex

This file was deleted.

21 changes: 0 additions & 21 deletions lib/pinchflat/downloader_backends/yt_dlp/video_collection.ex

This file was deleted.

75 changes: 75 additions & 0 deletions lib/pinchflat/media.ex
@@ -0,0 +1,75 @@
defmodule Pinchflat.Media do
  @moduledoc """
  The Media context.
  """

  import Ecto.Query, warn: false

  alias Pinchflat.Repo
  alias Pinchflat.Tasks
  alias Pinchflat.Media.MediaItem
  alias Pinchflat.MediaSource.Channel

  @doc """
  Returns the list of media_items. Returns [%MediaItem{}, ...].
  """
  def list_media_items do
    Repo.all(MediaItem)
  end

  @doc """
  Returns a list of pending media_items for a given channel, where
  pending means the `video_filepath` is `nil`.
  Returns [%MediaItem{}, ...].
  """
  def list_pending_media_items_for(%Channel{} = channel) do
    from(
      m in MediaItem,
      where: m.channel_id == ^channel.id and is_nil(m.video_filepath)
    )
    |> Repo.all()
  end

  @doc """
  Gets a single media_item.
  Returns %MediaItem{}. Raises `Ecto.NoResultsError` if the Media item does not exist.
  """
  def get_media_item!(id), do: Repo.get!(MediaItem, id)

  @doc """
  Creates a media_item. Returns {:ok, %MediaItem{}} | {:error, %Ecto.Changeset{}}.
  """
  def create_media_item(attrs) do
    %MediaItem{}
    |> MediaItem.changeset(attrs)
    |> Repo.insert()
  end

  @doc """
  Updates a media_item. Returns {:ok, %MediaItem{}} | {:error, %Ecto.Changeset{}}.
  """
  def update_media_item(%MediaItem{} = media_item, attrs) do
    media_item
    |> MediaItem.changeset(attrs)
    |> Repo.update()
  end

  @doc """
  Deletes a media_item and its associated tasks.
  Returns {:ok, %MediaItem{}} | {:error, %Ecto.Changeset{}}.
  """
  def delete_media_item(%MediaItem{} = media_item) do
    Tasks.delete_tasks_for(media_item)
    Repo.delete(media_item)
  end

  @doc """
  Returns an `%Ecto.Changeset{}` for tracking media_item changes.
  """
  def change_media_item(%MediaItem{} = media_item, attrs \\ %{}) do
    MediaItem.changeset(media_item, attrs)
  end
end
38 changes: 38 additions & 0 deletions lib/pinchflat/media/media_item.ex
@@ -0,0 +1,38 @@
defmodule Pinchflat.Media.MediaItem do
  @moduledoc """
  The MediaItem schema.
  """

  use Ecto.Schema
  import Ecto.Changeset

  alias Pinchflat.Tasks.Task
  alias Pinchflat.MediaSource.Channel
  alias Pinchflat.Media.MediaMetadata

  @required_fields ~w(media_id channel_id)a
  @allowed_fields ~w(title media_id video_filepath channel_id)a

  schema "media_items" do
    field :title, :string
    field :media_id, :string
    field :video_filepath, :string

    belongs_to :channel, Channel

    has_one :metadata, MediaMetadata, on_replace: :update

    has_many :tasks, Task

    timestamps(type: :utc_datetime)
  end

  @doc false
  def changeset(media_item, attrs) do
    media_item
    |> cast(attrs, @allowed_fields)
    |> cast_assoc(:metadata, with: &MediaMetadata.changeset/2, required: false)
    |> validate_required(@required_fields)
    |> unique_constraint([:media_id, :channel_id])
  end
end
28 changes: 28 additions & 0 deletions lib/pinchflat/media/media_metadata.ex
@@ -0,0 +1,28 @@
defmodule Pinchflat.Media.MediaMetadata do
  @moduledoc """
  The MediaMetadata schema.
  Look. Don't @ me about Metadata vs. Metadatum. I'm very sensitive.
  """

  use Ecto.Schema
  import Ecto.Changeset

  alias Pinchflat.Media.MediaItem

  schema "media_metadata" do
    field :client_response, :map

    belongs_to :media_item, MediaItem

    timestamps(type: :utc_datetime)
  end

  @doc false
  def changeset(media_metadata, attrs) do
    media_metadata
    |> cast(attrs, [:client_response])
    |> validate_required([:client_response])
    |> unique_constraint([:media_item_id])
  end
end
7 changes: 7 additions & 0 deletions lib/pinchflat/media_client/backends/backend_command_runner.ex
@@ -0,0 +1,7 @@
defmodule Pinchflat.MediaClient.Backends.BackendCommandRunner do
  @moduledoc """
  A behaviour for running CLI commands against a downloader backend
  """

  @callback run(binary(), keyword(), binary()) :: {:ok, binary()} | {:error, binary(), integer()}
end
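
The behaviour above, together with a config key naming the runner module, is what allows a different runner to be swapped in for tests. A minimal self-contained sketch of that pattern follows; every `RunnerSketch.*` module, the `:runner_sketch` application key, and the canned JSON are hypothetical stand-ins rather than code from this commit.

```elixir
defmodule RunnerSketch.Behaviour do
  # Same callback shape as the BackendCommandRunner behaviour
  @callback run(binary(), keyword(), binary()) :: {:ok, binary()} | {:error, binary(), integer()}
end

defmodule RunnerSketch.CannedRunner do
  @moduledoc "Hypothetical test double: returns fixed JSON instead of shelling out."
  @behaviour RunnerSketch.Behaviour

  @impl true
  def run(_url, _opts, _output_template) do
    {:ok, ~s({"channel": "Example Channel", "channel_id": "UC123"})}
  end
end

defmodule RunnerSketch.Client do
  # Callers never name a concrete runner; they ask the app config for one
  def channel_json(url) do
    runner().run(url, [playlist_end: 1], "%(.{channel,channel_id})j")
  end

  defp runner do
    # Assumption: falls back to the canned runner when nothing is configured
    Application.get_env(:runner_sketch, :runner, RunnerSketch.CannedRunner)
  end
end

IO.inspect(RunnerSketch.Client.channel_json("https://example.com/some-channel"))
```

Because the runner is looked up at call time, a test can call `Application.put_env/3` to point the key at a different module without touching the client code.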
32 changes: 32 additions & 0 deletions lib/pinchflat/media_client/backends/yt_dlp/channel.ex
@@ -0,0 +1,32 @@
defmodule Pinchflat.MediaClient.Backends.YtDlp.Channel do
  @moduledoc """
  Contains utilities for working with a channel's videos
  """

  use Pinchflat.MediaClient.Backends.YtDlp.VideoCollection
  alias Pinchflat.MediaClient.ChannelDetails

  @doc """
  Gets a channel's ID and name from its URL.
  yt-dlp does not _really_ have channel-specific functions, so
  instead we're fetching just the first video (using playlist_end: 1)
  and parsing the channel ID and name from _its_ metadata
  Returns {:ok, %ChannelDetails{}} | {:error, any, ...}.
  """
  def get_channel_details(channel_url) do
    opts = [playlist_end: 1]

    with {:ok, output} <- backend_runner().run(channel_url, opts, "%(.{channel,channel_id})j"),
         {:ok, parsed_json} <- Phoenix.json_library().decode(output) do
      {:ok, ChannelDetails.new(parsed_json["channel_id"], parsed_json["channel"])}
    else
      err -> err
    end
  end

  defp backend_runner do
    Application.get_env(:pinchflat, :yt_dlp_runner)
  end
end