Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let's start from the basics and work our way to what users need. #178

Open
dead-claudia opened this issue Apr 13, 2024 · 22 comments
Open

Let's start from the basics and work our way to what users need. #178

dead-claudia opened this issue Apr 13, 2024 · 22 comments

Comments

@dead-claudia
Copy link
Contributor

dead-claudia commented Apr 13, 2024

Warning: this initial comment is a bit lengthy. About 2.5k words, or about a 10-15 minute read.


Synthesizing a number of issues and comments of mine, along with some discussion on Discord, into something that shows what I'm envisioning.

The current API looks like this:

declare namespace Signal {
    export class State<T> {
        constructor(t: T, options?: SignalOptions<T, this>);
        get(): T;
        set(t: T): void;
    }

    export class Computed<T> {
        constructor(cb: (this: Computed<T>) => T, options?: SignalOptions<T, this>);
        get(): T;
    }

    export namespace subtle {
        export function untrack<T>(cb: () => T): T;
        export function currentComputed(): Computed | null;
        export function introspectSources(s: Computed | Watcher): (State | Computed)[];
        export function introspectSinks(s: State | Computed): (Computed | Watcher)[];
        export function hasSinks(s: State | Computed): boolean;
        export function hasSources(s: Computed | Watcher): boolean;

        export class Watcher {
            constructor(notify: (this: Watcher) => void);
            watch(...s: Array<State | Computed>): void;
            unwatch(...s: Array<State | Computed>): void;
            getPending(): Array<State | Computed>;
        }

        export var watched: Symbol;
        export var unwatched: Symbol;
    }

    export interface SignalOptions<T, S extends State<T> | Computed<T>> {
        readonly equals?: (this: S, t: T, t2: T) => boolean;
        readonly [Signal.subtle.watched]?: (this: S) => void;
        readonly [Signal.subtle.unwatched]?: (this: S) => void;
    }
}

That is a lot of API surface area. It's also a lot of bug surface area. And most of it could easily be managed in userland. Not to mention, the common case isn't many watchers, but one.

So I have an idea: let's go back to the drawing board and work through the basics to figure out what everyone needs. After that, let's then smooth out the edges to keep it simple, concise, and at least somewhat easy to use. Then, once we've done all that, we can add a few major nice-to-haves to improve everyone's experience.

Primitives

The most basic primitives for this model of cell-like reactive programming are the signal and the watcher.

  • Signals can be read, they can be written, and they can be declared.
  • Some signals may depend on other signals for their data. Others only depend on values, and are independent of any other signal.
  • Watchers exist to react to signal assignment.

These don't need to correlate directly with anything. Signals don't need to be objects, watchers don't need to be objects, and signal dependencies don't need to be properties of anything. Let's borrow a syntax proposed in this comment in #105 as pseudo-code to show how each of these work out.

Depend only on value Depend on another signal
Declare signal direct = value signal computed = direct
Read value = direct value = computed
Write direct = value computed = value/direct

The current API can only capture some of this:

Depend only on value Depend on another signal
Declare direct = new State(value) computed = new Computed(() => direct.get())
Read value = direct.get() value = computed.get()
Write direct.set(value) ???

There's a neat little trick here that could be exploited to recover that missing piece, though: store the body in a Signal.State and just read and invoke it on every update. Here's how one could do it with the current API:

class WritableComputed extends Signal.Computed {
    #state

    constructor(body, options) {
        const state = new Signal.State(body)
        super(() => state.get()(), options)
        this.#state = state
    }

    set(newBody) {
        this.#state.set(newBody)
    }
}

Turns out, if you want to round out that table, all you need to do is instead use that relatively simple WritableComputed wrapper class:

Depend only on value Depend on another signal
Declare direct = new State(value) computed = new WritableComputed(() => direct.get())
Read value = direct.get() value = computed.get()
Write direct.set(value) computed.set(() => value/direct.get())

So clearly, that's not a true primitive.

We do of course still need to be able to watch signals. We could just model that as signal.watch(notify) to add, signal.unwatch(notify) to remove. And notify() would be called each time a signal becomes dirty (read: whenever the watched signal or any signal it depends on is .set()).

To preserve batching, signals need to suppress invoking their notifier until instructed otherwise. One might think a way to do this is to do one-shot watchers, but that'll interfere with potential "on tracked"/"on untracked" hooks. So it'd be easier to just add a signal.unblock(notify) instead.

So far, we've found we need the following:

Note: signal represents either a state or a computed.

  • state = new State(initial)
  • computed = new Computed(body)
  • value = signal.get()
  • state.set(value)
  • signal.watch(notify)
  • signal.unwatch(notify)
  • signal.unblock(notify)

Exceptions and hooks

Now that we know what our base object types are and our most primitive methods, let's go over another elephant in the room: exceptions. Unfortunately, most things can throw them. Even in the language, they're everywhere. Even most expressions can throw them if you don't first coerce their operands (and the only coercion that can't throw is boolean coercion).

So now we have to add that in. One easy way is to just have Computed cache its exception. Easy enough, right?

Nope. Let's look at a simple throttling combinator that, if a call is throttled, updates one last time after the throttle duration expires so it doesn't lose any important updates.

function throttle(signal, ms) {
    let lastAccessed = -Infinity
    let timer, value

    const refresh = new State(undefined, {
        equals: () => false, // always update
    })

    const doRefresh = () => {
        refresh.set(undefined)
    }

    return new Computed(() => {
        const now = performance.now()
        const next = signal.get()
        refresh.get()
        const elapsed = now - lastAccessed

        if (elapsed < ms) {
            timer ??= setTimeout(doRefresh, ms - elapsed)
            return value
        }

        if (timer) {
            clearTimeout(timer)
            timer = undefined
        }

        lastAccessed = now
        return value = next
    })
}

We're already up to one hook: equals. The use of this isn't just in stuff like Immutable.js collections, but also cases where we just need to always update regardless of value.

This looks simple enough. But it has a neat little pitfall: imagine if signal is new Computed(() => { throw new Error("wat") }). Do we cache the error? If you say yes, imagine if your caller does this:

// Imagine `parseQuery` has a bug and is now throwing every time it's invoked.
const parsedQuery = new Computed(() => parseQuery(searchTypeahead.get()))

const throttledQuery = throttle(parsedQuery, 1000)

const fetchedResult = signalFunction(async (signal) => Promise.all([
    fetch("/notify", {signal}),
    fetch(`/search?query=${throttledQuery.get()}`, {signal}),
]))

// And in your view
searchField.onkeypress = () => {
    searchTypeahead.set(searchField.value)
}

Suddenly, you're spamming /notify over 10 times a second. And almost certainly you'll be encountering frame rate issues. Definitely not a good user experience.

So yep, we have to capture that error:

function throttle(signal, ms) {
    let lastAccessed = -Infinity
    let isError, timer, value

    const refresh = new State(undefined, {
        equals: () => false, // always update
    })

    const doRefresh = () => {
        refresh.set(undefined)
    }

    return new Computed(() => {
        const now = performance.now()
        let nextError = false
        let next

        try {
            next = signal.get()
        } catch (e) {
            next = e
            nextError = true
        }

        refresh.get()
        const elapsed = now - lastAccessed
        if (elapsed < ms) {
            timer ??= setTimeout(doRefresh, ms - elapsed)
            if (isError) throw value
            return value
        }

        if (timer) {
            clearTimeout(timer)
            timer = undefined
        }

        lastAccessed = now

        isError = nextError
        value = next

        if (nextError) throw next
        return next
    })
}

Of course, that's easy enough with that mostly-sync signal. Doesn't require explicit API surface, just a Computed that caches exceptions.

But we're all set now, right? Nope, we're still potentially invoking timers that might not even get used. Let's save whatever watcher some trouble and clean up after ourselves. We'll need a hook letting us know when we no longer have anyone watching us. Let's call that untrack.

// ...
const refresh = new State(undefined, {
    equals: () => false, // always update
    untrack() {
        if (timer) clearTimeout(timer)
    },
})

// ...

Async and Streaming

Suppose we want to convert an EventSource into a signal. The API is simple to wrap. There's three events, open, message, and error, and there's no extra properties to fuss with.

Obviously, only lossy sources should be wrapped like this.

function eventSource(url, options=undefined) {
    const source = new EventSource(url, options)
    const names = options?.names ?? ["message"]

    const initialState = Object.fromEntries(names.map((n) => [n, undefined]))
    const state = new State({value: initialState, isError: false}, {
        equals: (a, b) => (
            a.isError === b.isError &&
            (a.isError ? a.name === b.name && a.message === b.message : Object.is(a, b))
        ),
        untrack() {
            source.close()
        },
    })

    const onError = () => {
        state.set({value: new Error("Event source encountered a connection error"), isError: true})
    }

    const onMessage = (ev) => {
        let prev = initialState
        try {
            prev = state.get()
        } catch {
            // ignore
        }
        state.set({
            value: {
                ...prev,
                [ev.type]: ev.data,
            },
            isError: false,
        })
    }

    source.onerror = () => {
        source.onopen = source.onerror = null
        state.set({value: new Error("Event source failed to connect"), isError: true})
    }

    source.onopen = () => {
        source.onopen = null
        source.onerror = onError
        for (const name of names) source.addEventListener(name, onMessage)
    }

    return new Computed(() => {
        const {value, isError} = state.get()
        if (isError) throw value
        return value
    })
}

Thing is, this is a bit too eager. Let's make this lazier. A lot lazier. We might not need the event source right away, so why should we create one? Also, suppose we start watching the event source, stop watching it, and then start watching it again, all from the same signal. If that happens, we now have a disconnected source that isn't reconnecting. Let's instead allocate it on demand, so everything "just" works. And in order to do that, we need a track hook.

function eventSource(url, options=undefined) {
    const names = options?.names ?? ["message"]

    const initialState = Object.fromEntries(names.map((n) => [n, undefined]))
    let source

    const cleanup = () => {
        if (source) {
            source.onopen = source.onerror = null
            for (const name of names) source.removeEventListener(name, onMessage)
            source = undefined
        }
    }

    const onOpenError = () => {
        cleanup()
        state.set({value: new Error("Event source failed to connect"), isError: true})
    }

    const onOpen = () => {
        source.onopen = null
        source.onerror = onError
        for (const name of names) source.addEventListener(name, onMessage)
    }

    const onError = () => {
        state.set({value: new Error("Event source encountered a connection error"), isError: true})
    }

    const onMessage = (ev) => {
        let prev = initialState
        try {
            prev = state.get()
        } catch {
            // ignore
        }
        state.set({
            value: {
                ...prev,
                [ev.type]: ev.data,
            },
            isError: false,
        })
    }

    const state = new State({value: initialState, isError: false}, {
        equals: (a, b) => (
            a.isError === b.isError &&
            (a.isError ? a.name === b.name && a.message === b.message : Object.is(a, b))
        ),
        track() {
            source = new EventSource(url, options)
            source.onerror = onOpenError
            source.onopen = onOpen
        },
        untrack() {
            source?.close()
            cleanup()
        },
    })

    return new Computed(() => {
        const {value, isError} = state.get()
        if (isError) throw value
        return value
    })
}

What about that signalFunction from earlier. How would we implement that? Once again, we need exceptions.

export function signalFunction(body, options=undefined) {
    let ctrl

    function cleanup() {
        const prev = ctrl
        ctrl = undefined
        if (prev !== undefined) {
            try {
                prev.abort()
            } catch (e) {
                reportError(e)
            }
        }
    }

    const state = new State(undefined, {
        untrack: cleanup,
    })

    const invoker = new Computed(() => {
        cleanup()

        const c = ctrl = new AbortController()

        new Promise((resolve) => {
            resolve(body(c.signal))
        }).then(
            (value) => {
                if (c === ctrl) state.set({value, isError: false})
            },
            (e) => {
                if (c === ctrl) {
                    state.set({value: e, isError: true})
                } else {
                    reportError(e)
                }
            },
        )
    })

    return new Computed(() => {
        // Register the dependency and start the body.
        invoker.get()
        const {value, isError} = state.get()
        if (isError) throw value
        return value
    }, options)
}

Simple enough. Didn't need much. But we have a problem: how do we know data is ready? Imagine we have code like this using it:

// Define
function detailsInfo() {
    let handler
    const event = new State(undefined, {
        track() {
            emitter.on("foo", handler = (v) => this.set(v))
        },
        untrack() {
            emitter.off("foo", handler)
        },
    })

    const resource = signalFunction(async () => {
        const response = await fetch(`/thing/${Model.currentId.get()}/detail`)
        if (!response.ok) throw new Error(`Response failed: ${await response.text()}`)
        return response.json()
    })

    return new Computed(() => {
        // combine `event` and `resource` somehow
        return {
            lastEvent: event.get(),
            currentDetail: resource.get(),
        }
    })
}

// Use
const state = getStateSomehow()

// Suppose this `effect` is wired up by a framework using signals.
effect(() => {
    updateUI(state.get())
})

// And in your view
return <ThingCard
    thing={thing}
    onSelect={() => Model.currentId.set(thing.id)}
/>

Now, suppose we were just told to show a loading spinner modal while we're waiting. Unfortunately, this means we need something to tell us out of band when our data is meaningfully "ready". And the simplest way? Throw a .isPending to tell us we're still waiting on the data to come.

const state = getStateSomehow()

effect(() => {
    // Yep, just one line.
    if (state.isPending) showLoadingSpinner()
    updateUI(state.get())
})

That's all fine and dandy, but how do we get that? Well, we need two things:

  • A way to set it to some state
  • A way to propagate it

So, how about we augment state.set to accept an extra isPending option. Now it looks like state.set(value, {isPending}), where isPending defaults to false (as most things are sync and we aren't waiting on them). And let's have Computed propagate that flag from its dependencies. So the above state.isPending works, signal.isPending will also need to be auto-tracked similarly to state.get.

Now, we can fix signalFunction to reflect that state.

We also need a runWithoutTrackingGets function to let us .get() the old value without

export function signalFunction(body, options=undefined) {
    let ctrl

    function cleanup() {
        const prev = ctrl
        ctrl = undefined
        if (prev !== undefined) {
            try {
                prev.abort()
            } catch (e) {
                reportError(e)
            }
        }
    }

    const state = new State(undefined, {
        untrack: cleanup,
    })

    const invoker = new Computed(() => {
        cleanup()
        state.set(runWithoutTrackingGets(() => state.get()), {isPending: true})

        const c = ctrl = new AbortController()

        new Promise((resolve) => {
            resolve(body(c.signal))
        }).then(
            (value) => {
                if (c === ctrl) state.set({value, isError: false})
            },
            (e) => {
                if (c === ctrl) {
                    state.set({value: e, isError: true})
                } else {
                    reportError(e)
                }
            },
        )
    })

    return new Computed(() => {
        // Register the dependency and start the body.
        invoker.get()
        const {value, isError} = state.get()
        if (isError) throw value
        return value
    }, options)
}

Computed options

You've probably noticed me passing an options object to the Computed constructor. Turns out you can do all of that except equals in a helper State + Watcher.

function computedWithOpts(body, options) {
    const state = new State(undefined, {...options, equals: null})

    return new Computed(() => {
        state.get()
        return body()
    })
}

Unfortunately, equals cannot be desugared so easily. I'll leave it as an academic exercise, but you'll need a watcher as well as a state variable and just some generally awkward coding to avoid accidentally linking the state variable to that inner watcher. But suffice to say, while you don't technically need equals for Computed, it'll be way simpler to implement in that method.

Introspection

Sometimes, you want to be able to optimize a data flow. If nothing changed, why bother re-getting the value at all. This comes up a fair bit with things like virtual DOM trees: if you have a signal returning a tree, you can save considerable rendering time by checking if the signal is "dirty" (read: has an update that hasn't been read yet) and skipping the whole process if it's not. This necessitates a new property on both computed and state signals: signal.isDirty.

Now, suppose you want to know what watchers are watching a given signal. Simplest way to expose that is via a simple function: watchers = signal.getWatchers(). You can also use this to figure out which signals from an array are connected to a given watcher.

In some niche use cases, it may also be useful to know what signals a given signal depends on. This can be provided very simply via deps = signal.getDependencies().

In some niche use cases, it may be helpful to know what the current Computed context is. This can be provided very simply via currentComputed().

Variadic watchers

The current Watcher class could be simulated using watch, unwatch, and isDirty. It doesn't even need any of the rest of the introspection.

class Watcher {
    #signals = new Set()
    #notify

    constructor(notify) {
        this.#notify = () => notify.call(this)
    }

    watch(...signals) {
        for (const signal of signals) {
            if (this.#signals.has(signal)) continue
            signal.watch(this.#notify)
            this.#signals.add(signal)
        }
    }

    unwatch(...signals) {
        for (const signal of signals) {
            if (this.#signals.delete(signal)) {
                signal.unwatch(this.#notify)
            }
        }
    }

    getPending() {
        return [...this.#signals].filter((signal) => signal.isDirty)
    }
}

Real core

So the real core now looks like this, where signal can be either state or computed:

  • state = new State(initial, options?)
    • options.equals(a, b)
    • options.track()
    • options.untrack()
  • computed = new Computed(body, options?)
    • options.equals(a, b)
  • state.set(value, {isPending?})
  • value = signal.get()
  • signal.isPending
  • signal.isDirty
  • signal.watch(notify)
  • signal.unwatch(notify)
  • signal.unblock(notify)
  • runWithoutTrackingGets(body)
  • watchers = signal.getWatchers()
  • deps = computed.getDependencies()

API Refinement

States are simple, and pretty much everything is needed:

  • state = new Signal.State(initial, options?)

Options:

  • options.equals(a, b)
  • options.track()
  • options.untrack()

Computeds don't have much (if any) use case for options other than equals. So I'm dropping them. It's otherwise very similar. Edit: they apparently do have a case where the track/untrack options come super in handy so they can just receive all the same options. (h/t @shaylew)

  • computed = new Signal.Computed(body, options?)

Due to problems with circularity, I specifically propose not calling non-options callbacks with this values.

And for .set, an object is rather expensive in a path potentially that hot. Let's lower that to an inline boolean.

  • state.set(value, isPending=false)

You may have noticed this pattern a lot. And I mean a lot.

// Get
const {value, isError} = state.get()
if (isError) throw value
return value

// Set result
state.set({value, isError: false})

// Set exception
state.set({value: e, isError: true})

Let's optimize this by adding a .setThrow(e, isPending=false) that stores an exception to later throw from .get(). It'll be stored the same way as it is in Computed. This also simplifies the graph walk a bit and lets Computed and State share most their .get() implementations. And while we're at it, let's make .isPending writable, since it's sometimes useful to explicitly set it without also changing the value. (If both need changed separately, it's commonly batched anyways.)

To see this in action, here's `signalFunction` re-implemented in terms of that.
export function signalFunction(body, options=undefined) {
    let ctrl

    function cleanup() {
        const prev = ctrl
        ctrl = undefined
        if (prev !== undefined) {
            try {
                prev.abort()
            } catch (e) {
                reportError(e)
            }
        }
    }

    const state = new State(undefined, {
        untrack: cleanup,
    })

    const invoker = new Computed(() => {
        cleanup()
        state.isPending = true

        const c = ctrl = new AbortController()

        new Promise((resolve) => {
            resolve(body(c.signal))
        }).then(
            (value) => {
                if (c === ctrl) state.set(value)
            },
            (e) => {
                if (c === ctrl) {
                    state.setThrow(e)
                } else {
                    reportError(e)
                }
            },
        )
    })

    return new Computed(() => {
        // Register the dependency and start the body.
        invoker.get()
        return state.get()
    }, options)
}
It also makes the `EventSource` helper a bit more readable.
function eventSource(url, options=undefined) {
    const names = options?.names ?? ["message"]

    const initialState = Object.fromEntries(names.map((n) => [n, undefined]))
    let source

    const cleanup = () => {
        if (source) {
            source.onopen = source.onerror = null
            for (const name of names) source.removeEventListener(name, onMessage)
            source = undefined
        }
    }

    const onOpenError = () => {
        cleanup()
        state.setThrow(new Error("Event source failed to connect"))
    }

    const onOpen = () => {
        source.onopen = null
        source.onerror = onError
        for (const name of names) source.addEventListener(name, onMessage)
    }

    const onError = () => {
        state.setThrow(new Error("Event source encountered a connection error"))
    }

    const onMessage = (ev) => {
        let prev = initialState
        try {
            prev = state.get()
        } catch {
            // ignore
        }
        state.set({
            ...prev,
            [ev.type]: ev.data,
        })
    }

    const state = new State({value: initialState, isError: false}, {
        track() {
            source = new EventSource(url, options)
            source.onerror = onOpenError
            source.onopen = onOpen
        },
        untrack() {
            source?.close()
            cleanup()
        },
    })

    // Just making it read-only.
    return new Computed(() => state.get())
}

The watchers are functionally the same for either type, so those make more sense as static methods.

  • Signal.watch(signal, notify)
  • Signal.unwatch(signal, notify)
  • Signal.unblock(signal, notify)

For the introspection methods, signal.getWatchers() and computed.getDependencies(), it makes more sense to just combine the two methods into a single Signal.introspect(signal) call and just throw everything in there. If more things turn up that's interesting, it's easy to extend it later. You could even extend it with other existing fields like .isDirty, .isPending, and .value. So let's just wrap all that into just that one convenience method.

There's a secondary benefit to this: on the lowest-resource embedded, if nobody calls Signal.introspect, runtimes like XS could just compile the whole method out.

It's also not like this method will be called often, so it's okay to be a bit slow.

Note that this does not count as a .get() of the signal.

  • info = Signal.introspect(signal)

For runWithoutTrackingGets, let's give it the ability to forward arguments. Once natively implemented, engines can optimize for that and inline away the boilerplate, much like what they already do for Reflect.apply and Function.prototype.call. Engines may even be able to detect the idiom Signal.runWithoutTrackingGets(() => ...) and elide the closure in JIT code.

  • result = Signal.runWithoutTrackingGets(func, ...args)

The rest is fine as it is.

  • value = signal.get()
  • signal.isPending
  • signal.isDirty

New type definitions

This is what would be exposed at the language level:

declare namespace Signal {
    export class State<T> {
        constructor(initial: T, options: StateOptions<T>)
        get(): T
        set(value: T, isPending?: boolean): void
        setThrow(value: unknown, isPending?: boolean): void
        readonly isPending: boolean
        readonly isDirty: boolean
    }

    export class Computed<T> {
        constructor(body: () => T, options: ComputedOptions<T>)
        get(): T
        readonly isPending: boolean
        readonly isDirty: boolean
    }

    export function runWithoutTrackingGets<R, A extends any[]>(func: (...args: A) => R, ...args: A): R
    export function watch(signal: State<any> | Computed<any>, notify: () => void): void
    export function unwatch(signal: State<any> | Computed<any>, notify: () => void): void
    export function unblock(signal: State<any> | Computed<any>, notify: () => void): void
    export function introspect<T>(signal: State<T> | Computed<T>): SignalIntrospection<T>
    export function currentComputed(): undefined | Computed<any>

    export interface StateOptions<T> {
        readonly equals?: (this: State<T>, a: T, b: T) => boolean
        readonly track?: (this: State<T>) => boolean
        readonly untrack?: (this: State<T>) => boolean
    }

    export interface ComputedOptions<T> {
        readonly equals?: (this: Computed<T>, a: T, b: T) => boolean
        readonly track?: (this: Computed<T>) => boolean
        readonly untrack?: (this: Computed<T>) => boolean
    }

    export interface SignalIntrospection<T> {
        value: T
        isDirty: boolean
        isPending: boolean
        watchers: Array<() => void>
        dependencies: Array<State<any> | Computed<any>>
    }
}

Possible questions

I found a typo or mistake.

Let me know and I'll fix it as soon as I can.

What exactly are you trying to accomplish here?

My goal is to identify the minimal core and re-focus the discussion to center around that. A clean sheet base design, if you will.

Even if my particular design isn't chosen, it'd be a win in my book to just steer discussion away from various features and to just focus on the core concepts and how they would be helpful.

A simple API is a secure and performant API, so I'm all for applying the KISS principle to this. Simplicity is especially important for an API attempting to be present as a low-level support API and not just a high-level userland API.

What happened to subtle? Why are we throwing it away?

The analogy to crypto.subtle is tenuous at best.

As stated there, the reason it's crypto.subtle and not just crypto is because it truly is subtle. There's an entire class of flaws that could show up in code using crypto.subtle that won't show as errors. They won't show as anything. No number of unit tests will help you find them. And when you do fix them, you often can't write a meaningful regression test to make sure it doesn't happen again. Each change to code using it requires thorough understanding of both the code and the mathematical underpinnings of the surrounding system's design. If a test vector doesn't yield the expected result, but the code otherwise executes successfully, step debuggers are frequently useless. These kinds of subtle yet extremely dangerous flaws are precisely why the API's exposed as crypto.subtle and not just crypto.

This is very much not the case for signals. Yes, you could run yourself into a very difficult time debugging if you're not careful with your states. But you will not have issues with step debuggers. And you will not be dealing with bugs you cannot write regression tests for.

Oh, and there is some precedent already for sharp tools not existing in special namespaces, even among newer additions. To name a few big ones:

  • Proxy.revocable doesn't live in a special namespace.
  • None of the proxy handler methods are symbol-keyed, despite some of them like getPrototypeOf being potentially very large footguns.
  • Resizable array buffers don't use special types. This was considered and explicitly decided against by the committee. (This is listed in meeting notes somewhere, but I forget where and don't feel like tracking it down.) Shrinkable buffers has some potential Zalgo-like interactions with typed arrays, and that was specifically discussed if I recall correctly.
  • FinalizationRegistry is a very tough API to use correctly. You can write automated tests to test stuff using it, but such tests would be inherently very flaky. And it's extremely easy to accidentally retain a strong reference to the source object - it takes special object modeling to not do so. If anything, the crypto.subtle analogy fits this far better than anything Signal-related.

Why runWithoutTrackingGets(...) and not untrack(...)?

That name really needs bikeshedding. And untrack doesn't quite encapsulate what the function really does.

Why toss out the Watcher class?

Three reasons:

  1. It's easy to roll in userland. Just see the "Variadic Watchers" section for what I mean.
  2. Most users are just going to watch a single Computed in practice.
  3. You can, in many cases, just wrap your signals in a Computed one, do what you would normally do but in that Computed's body, and call it a day.

Isn't this a lot like railway-oriented programming?

Kinda yes, but without the explicit combinators.

Isn't this a lot like $LIBRARY/FRAMEWORK?

Probably.

I'm admittedly not the most well-versed in the world of signals. I come from a deep background of React-like frameworks, being a former maintainer of a well-known React competitor (I'm still semi-active behind the scenes in that project, just not in GitHub directly). Until about a week ago, I myself was rather skeptical of the value of signals. I'm still learning the lay of the land of signals here, so please bear that in mind!

My question isn't answered here. What about $X?

Well...lemme grab my crystal ball real quick. $ANSWER.

In all seriousness, feel free to ask me anything. I'm all ears, and as I said in the last question, I'm still learning. I'm certainly not the signals expert here, so I'm pretty much expecting to have missed something significant.

I'm just a nobody. I feel my question might be stupid or a complete and utter waste of time.

No it isn't. And I'm gonna be honest: if you're afraid to ask, you're the kind of person I want to hear the most from. I'm not joking. Don't be afraid - I won't bite. And neither will the rest of us here. 🙂


cc @littledan for visibility

@dead-claudia
Copy link
Contributor Author

dead-claudia commented Apr 13, 2024

Update: added a way to block signals.

As an alternative, one could have .watch(notify) return a watcher object with .unblock() and .remove() methods. Then, signals would internally just retain a (propagated) set of watcher objects and iterate those. On .set, it'd do essentially the following:

for (const watcher of target.#watchers) {
    if (!watcher.#blocked) {
        watcher.#blocked = true
        const notify = watcher.#notify
        notify()
    }
}

You could take this one step further and allow some mild memory savings by offering something close to the current API:

  • watcher = new Watcher(notify): Create the watcher yourself
  • watcher.watch(signal): Connect the watcher to a given signal
  • watcher.unwatch(signal): Disconnect the watcher from a given signal
  • watcher.unblock(): Unblock calling of this watcher's notify. Literally just a field set internally.

This would bring the API to this:

declare namespace Signal {
    export class State<T> {
        constructor(initial: T, options: StateOptions<T>)
        get(): T
        set(value: T, isPending?: boolean): void
        setThrow(value: unknown, isPending?: boolean): void
        readonly isPending: boolean
        readonly isDirty: boolean
    }

    export class Computed<T> {
        constructor(body: () => T, options: ComputedOptions<T>)
        get(): T
        readonly isPending: boolean
        readonly isDirty: boolean
    }

    export class Watcher {
        constructor(notify: () => void)
        watch(signal: State<any> | Computed<any>): void
        unwatch(signal: State<any> | Computed<any>): void
        unblock(): void
    }

    export function runWithoutTrackingGets<R, A extends any[]>(func: (...args: A) => R, ...args: A): R
    export function introspect<T>(signal: State<T> | Computed<T>): SignalIntrospection<T>
    export function currentComputed(): undefined | Computed<any>

    export interface StateOptions<T> {
        readonly equals?: (this: State<T>, a: T, b: T) => boolean
        readonly track?: (this: State<T>) => boolean
        readonly untrack?: (this: State<T>) => boolean
    }

    export interface ComputedOptions<T> {
        readonly equals?: (this: Computed<T>, a: T, b: T) => boolean
        readonly track?: (this: Computed<T>) => boolean
        readonly untrack?: (this: Computed<T>) => boolean
    }

    export interface SignalIntrospection<T> {
        value: T
        isDirty: boolean
        isPending: boolean
        watchers: Array<Watcher>
        dependencies: Array<State<any> | Computed<any>>
    }
}

This is just one other possible solution.

@dead-claudia
Copy link
Contributor Author

Okay, after some further experimentation with #178 (comment), if you block the watcher right after, you can just get away with separate signal.set(value) and signal.isPending = ... properties. For ease of use, signal.set and signal.setThrow could do the equivalent of signal.isPending = false.

It'd simplify signalFunction a bit:

export function signalFunction(body, options=undefined) {
    let ctrl

    function cleanup() {
        const prev = ctrl
        ctrl = undefined
        if (prev !== undefined) {
            try {
                prev.abort()
            } catch (e) {
                reportError(e)
            }
        }
    }

    const state = new State(undefined, {
        untrack: cleanup,
    })

    const invoker = new Computed(() => {
        cleanup()
        state.isPending = true

        const c = ctrl = new AbortController()

        new Promise((resolve) => {
            resolve(body(c.signal))
        }).then(
            (value) => {
                if (c === ctrl) state.set(value)
            },
            (e) => {
                if (c === ctrl) {
                    state.setThrow(e)
                } else {
                    reportError(e)
                }
            },
        )
    })

    return new Computed(() => {
        // Register the dependency and start the body.
        invoker.get()
        const {value, isError} = state.get()
        if (isError) throw value
        return value
    }, options)
}

@justinfagnani
Copy link
Collaborator

Not to mention, the common case isn't many watchers, but one.

I don't think this is true at all. I can easily see apps with thousands of watchers.

@robbiespeed
Copy link

Now that we know what our base object types are and our most primitive methods, let's go over another elephant in the room: exceptions. Unfortunately, most things can throw them. Even in the language, they're everywhere. Even most expressions can throw them if you don't first coerce their operands (and the only coercion that can't throw is boolean coercion).

So now we have to add that in. One easy way is to just have Computed cache its exception. Easy enough, right?

A lot of this seems to be based around the premise that Computed's must cache their exceptions, but I don't really see any clear argument why that's the case. We know things in JS can throw, but that's not a problem limited to signals, and we already have a solution try/catch.

If someone desires caching the exception then they should try/catch in the the Computed and have it store something like a { error: unknown; value?: T } object. Otherwise an exception should either:

A) Invalidate the Computed for it's entire lifetime, drop all it's dependencies and consumers, and throw a generic broken computed error (possibly with cause). Each .get() after throws the same error. The error is thrown before the computed can be tracked.

B) The computed passes along the exception to the caller of .get() and does not cache anything, calling .get() after tries to reevaluate the value (and may throw again). Errors are thrown after computed is tracked.

A is somewhat like the error caching approach, but avoids any errors inside later (past the first) watchers, since it can no longer trigger further updates.

B is probably my preference though as it leaves the most freedom for error handling up to the user or libraries. Although auto-tracking makes it impossible for libs to decide whether it should be tracked before or after a possible error (just one more thing to add to my list of bad things about auto-tracking).

@dead-claudia
Copy link
Contributor Author

@justinfagnani I said the "common" case, not the "only" case, to be clear. There is definitely no shortage of use cases for multiple watchers, and even I ran into one recently.

It's just not the most common.

@dead-claudia
Copy link
Contributor Author

A lot of this seems to be based around the premise that Computed's must cache their exceptions, but I don't really see any clear argument why that's the case. We know things in JS can throw, but that's not a problem limited to signals, and we already have a solution try/catch.

@robbiespeed There's one core reason to cache exceptions generated by Computeds: to avoid overloading and exhausting resources in the face of faults.

Let me explain this with a story. Suppose, in a web app somewhere, you're getting an expensive Computed value. To avoid blocking the main thread, you offload it to a worker and pair it with an inner State that holds the value. The code might look like this:

let worker

const result = new Signal.State()

const fetch = new Signal.Computed(() => {
    worker.postMessage({type: "start", params: params.get()})
    showExecutingUI()
}, {
    track() {
        worker = new Worker("...")
        worker.onmessage = ev => {
            if (ev.type === "finish") {
                result.set(ev.data)
            }
        }
    },
    untrack() {
        worker.terminate()
        worker = undefined
    },
})

const expensive = new Signal.Computed(() => {
    fetch.get()
    return result.get()
})

Suppose you're accessing expensive.get() in the render function. This could be called as often as once per frame. Since you offloaded the compute-heavy part to a worker, you're safe from dropped frames.

Now, suppose showExecutingUI() throws. If you don't cache the exception, the postMessage could be called anywhere from 30 to 240 times per second depending on the display refresh rate. The worker is now overloaded, receiving tasks faster than it can execute them, and so you now have a CPU core stuck at 100%, limiting system performance, as well as a slow memory leak as tasks keep getting put into the worker's queue.

Your bug report won't just be that the UI fails to update, but low-end laptop users will also complain that you're causing their laptop to heat up and slow down. You might read that and immediately check to make sure the worker is actually posting back and not getting stuck in a loop. And so you'd dive into that red herring, not knowing it's a red herring.

If the exception was cached, the laptop temperature issue wouldn't happen, and simply asking users to report back their CPU usage would tell you right away that it's not a worker problem.

@robbiespeed
Copy link

There are many mistakes that can put a system in a compromised situation. try/catch could have solved this problem as well. You can also make your own error caching computed:

class TryComputed extends Computed {
  #innerComp;
  constructor (compute) {
    super(this.#getResult);
    this.#innerComp = new Computed(() => {
      try {
        return { isError: false, value: compute() };
      } catch (err) {
        return { isError: true, value: err };
      }
    });
  }
  #getResult() {
    const result = this.#innerComp.get();
    if (result.isError) {
      throw result.value;
    }
    return result.value;
  }
}

There are use cases where you might not desire errors to be cached. Ex:
Synchronizing non reactive state along with reactive state.

// x is not a signal and does not contain a signal, instead it has a `.isPending` which is set internally.
// y is a signal
const xPlusY = new Computed(() => {
  if (x.isPending) {
    throw new Error("x is not ready");
  }
  return x.value + y.get();
});

Caching the error would mean the computed never has any possibility to accomplish it's goal.

@dead-claudia
Copy link
Contributor Author

@robbiespeed Your abstraction is starting to look a lot like the {value, isError, isPending} async construction I made in the original post.

Also, your x.isPending, if you're going by my proposed accessor for async support, would be reactive (and it necessarily has to be). If you mean x.isDirty, that in a computed that also calls x.get() would in practice always return true for computeds and always return false for states. It's almost never what you really want.

For other non-reactive state, if you aren't accessing something before the error, updates to it will almost certainly error again as nothing truly changed. signalFunction itself has to work around this using a State + fetcher Computed pair and a Computed wrapper to drive the fetcher signal and return the state. This is pretty much how you'd handle non-reactive updates in general, for what it's worth.

@robbiespeed
Copy link

Also, your x.isPending, if you're going by my proposed accessor for async support, would be reactive (and it necessarily has to be).

Let's assume it can't be reactive, because it's from a 3rd party. If it were reactive, then throwing wouldn't be the appropriate action. The whole reason for throwing in this case is to ensure nothing gets cached, so a subsequent get() can rerun the computed.

@dead-claudia
Copy link
Contributor Author

@robbiespeed I took a step back and I think I see what you're (indirectly) really hinting at: caching is strictly not necessary at all for computeds, provided there's a way to check for dirtiness. And you're almost right. Almost.

Caching could almost in theory be done like this in userland:

function cached(body, options) {
    const equals = options?.equals ?? Object.is

    let isException = false
    let value

    const cached = new UncachedComputed(() => {
        if (cached.isDirty) {
            try {
                let next = body()
                if (isException && !equals(value, next)) {
                    value = next
                    isException = false
                }
            } catch (e) {
                value = e
                isException = true
            }
        }

        if (isException) throw value
        return value
    }, options)

    return cached
}

But there's a very subtpe problem: if equals returns true, the entire cell should be considered not dirty, and this needs to be able to block parent cells from executing. To show what I mean, consider this code:

const source = new State(0)
const isEven = new Cached(() => source.get() % 2 === 0)
const display = new Cached(() => isEven.get() ? "even" : "odd")

In this, if you set source to 2, display should not re-run, but if you set it to 1, it should. The intuition is that isEven, when recalculated, didn't truly change, even though its dependency source did. It's still the same value as before, and thus display shouldn't change.

In order to do this, you have to check it as part of parentComputed.isDirty. It can also impact uncached computeds. This is why this hypothetical Cached must be a primitive in order to function correctly.

@robbiespeed
Copy link

No I'm not hinting that caching isn't necessary at all, Computed should cache values, but not errors. UncachedComputed computed is unrelated, but probably should be part of the API since they make sense to use at leaf nodes where you know they only need to be read once per update.

In this, if you set source to 2, display should not re-run, but if you set it to 1, it should. The intuition is that isEven, when recalculated, didn't truly change, even though its dependency source did. It's still the same value as before, and thus display shouldn't change.

I don't believe Computed should be concerning itself with memoization, it should only care about storing the last value until it gets dirtied. Once it gets dirtied the old value should be thrown away otherwise you end up with memory inefficiencies. If people want memo they can implement that in user land themselves, and it only makes sense to do so for expensive computations.

const calcXthPrime = memo((x) => /* */);
const nthPrime = new Computed(() => calcXthPrime(n.get()));

@nin-jin
Copy link

nin-jin commented Apr 18, 2024

Notifications should not block - its should be executed synchronously and cascaded as early as possible, otherwise the application will go into an inconsistent state. But reactions to these notifications should already be lazy and not run unnecessarily. I described this issue here: https://dev.to/ninjin/tonus-of-reactivity-20h1
image

@dead-claudia
Copy link
Contributor Author

@robbiespeed

No I'm not hinting that caching isn't necessary at all, Computed should cache values, but not errors.

Okay, I'll go back your previous comment, then:

There are use cases where you might not desire errors to be cached. Ex:
Synchronizing non reactive state along with reactive state.

[snip code]

Those should honestly not be pure signals in the first place. And in order to even drive reads to non-reactive state correctly in the first place, you have to use a messenger signal. If there's any interior branch whatsoever that could conditionally read a signal, you already need a messenger signal - it's not unique to throwing.

I do see value in a signal type that always invokes its body (and necessarily untracking its body), as it's very useful in its own right for things like localStorage where you want the caching, but the value isn't reactive. Maybe something like new Accessor(() => ...) could work. (Note that this would be awkward to implement and would be considerably slower than other signal types as it has to always run.)

// x is not a signal and does not contain a signal, instead it has a `.isPending` which is set internally.
// y is a signal
const xPlusY = new Accessor(() => {
  if (x.isPending) {
    throw new Error("x is not ready");
  }
  return x.value + y.get();
});

I don't believe Computed should be concerning itself with memoization, it should only care about storing the last value until it gets dirtied. Once it gets dirtied the old value should be thrown away otherwise you end up with memory inefficiencies.

You can only do that if you maintain parent references. That creates significant compute overhead in .get(), and it adds potentially considerable memory overhead for each signal. Oh, and it complicates GC a bit since you have to make the references weak to prevent memory leaks.

This hidden complexity is something the proposal's been trying really, really hard to avoid. It always comes with both security and performance risks, and it also makes implementors question the value of it.

@nin-jin
Copy link

nin-jin commented Apr 18, 2024

There is a typical error in your analysis with the separation of signals into state and computed ones. However, you forget about lazy state calculation and automatic resource release. You are trying to solve these problems by further complicating the API by introducing track/untrack. But everything would be much simpler if the signal took a calculated function and memoizes its return value. I described this issue here: https://dev.to/ninjin/reactive-resources-management-39i5

class Project extends Entity {

    @mems task( id: number ) {
        return new Task( id )
    }

    destructor() {
        console.log( this, 'is destroyed' )
    }

}

class Account extends Entity {

    @mems project( id: number ) {
        return new Project( id )
    }

}

class App extends Object {

    @mem static account() {
        return new Account()
    }

    @act static main() {
        console.log( this.account().project(23).task(456), 'is used' )
    }

    static [ Symbol.toStringTag ] = 'App'

}

App.main()
// logs
//      App.account().project(23).task(456) is used
//      App.account().project(23) is destroyed

@mem is one-signal-decorator, @mems is multi-signal-decorator, @act is one-time action.

@nin-jin
Copy link

nin-jin commented Apr 18, 2024

You are mistaken in thinking that watchers do not depend on signals and other watchers. If they are triggered independently of each other, then this leads to unstable behavior. Watchers should be assembled into a tree with a single root in order to have a deterministic order of calculation and ensure the correct order of dynamic (un)subscriptions in the process of updating the state graph. I described this issue here: https://dev.to/ninjin/reactive-computing-order-1ef5

image

Correct order. Note that the PostPage was destroyed before it would have started to run its effects.

@nin-jin
Copy link

nin-jin commented Apr 18, 2024

Well, why errors and promises should be cached in the same way as the returned values, I described here: https://dev.to/ninjin/exceptions-in-reactivity-23ca
image

This is the only way to ensure consistency of states under any circumstances.

@robbiespeed
Copy link

You can only do that if you maintain parent references. That creates significant compute overhead in .get(), and it adds potentially considerable memory overhead for each signal. Oh, and it complicates GC a bit since you have to make the references weak to prevent memory leaks.

This hidden complexity is something the proposal's been trying really, really hard to avoid. It always comes with both security and performance risks, and it also makes implementors question the value of it.

There are lots of different ways to implement a memoization function. You don't need to maintain a parent reference to the signal, you can pass it in to the memoized functions just like any other value. The memo internals could store the resulting value and inputs, which would automatically get GCed once the memozied function itself no longer has any references.

Asides from that, there's nothing wrong with using WeakRefs. This proposal will likely need WeakRefs in the native implementation, otherwise it won't be memory efficient, see #176.

@robbiespeed
Copy link

Well, why errors and promises should be cached in the same way as the returned values, I described here: https://dev.to/ninjin/exceptions-in-reactivity-23ca

@nin-jin This seems to be describing a eagerly reactive graph. In a lazily reactive graph what happens after a source changes is everything it depends on ends up in what's called a dirty state. Nothing gets recomputed until it gets accessed and it's been marked dirty. At which point an error being thrown would not put the application in a unstable state, but rather it can keep everything along the path as dirty (no error cache) or mark it as an error (error cache). There are other options like "Stop" which I think is similar to what I described here as option A.

Caching the error would mean that it would keep throwing the same error instance, until a source changes, at which point the computed could rerun on the next get.

Not caching the error would mean it will rerun the computed on the next get even if no sources have changed. If it errors again nothing gets cached and state remains dirty. If it doesn't error the result gets cached.

@dead-claudia
Copy link
Contributor Author

@nin-jin Your criticisms here would be much better suited for their own issue.

Also, your @mem, @mems, and @act could very easily be written in terms of signals and track/untrack. I'll leave out the implementation, but the first is basically a Computed-backed property, the second a Map<ID, Computed>-backed object using track/untrack to avoid memory leaks, and the third is just a Computed with dependent signals.

The goal of this proposal anyways is to provide the underlying dependency tracking primitives, not the high-level sugar primitives.

@dead-claudia
Copy link
Contributor Author

@robbiespeed The problem is, if I understand correctly, you'd be requiring the old computed value to be thrown away in .set(), and your parent reference only comes for free in .get().

If you're wanting to throw it away in .get(), that's already the behavior today, and my proposal here doesn't change that.

Will note that this is technically an implementation detail and an implementor could choose to do that optimization. The two behaviors are technically indiscernible to ECMAScript code assuming WeakRef never returns undefined and FinalizationRegistry never invokes its callback. And in general, the spec tries not to require optimizations, even though some are essentially expected by users.

@shaylew
Copy link
Collaborator

shaylew commented Apr 20, 2024

@robbiespeed Could you bring some of this discussion of cached vs. uncached computeds to #151 if it's the same topic, and start a new issue for the error-vs-value suggestions if you consider that a separate concept? I'm not sure what you mean with the caching vs. memoization discussion but it seems like it applies equally to the main proposal and to this issue's proposed tweaked API, so I don't want it to get buried in this thread.

@JosXa
Copy link

JosXa commented Apr 28, 2024

In this, if you set source to 2, display should not re-run, but if you set it to 1, it should. The intuition is that isEven, when recalculated,

didn't truly change
, even though its dependency source did. It's still the same value as before, and thus display shouldn't change.

In order to do this, you have to check it as part of parentComputed.isDirty. It can also impact uncached computeds. This is why this hypothetical Cached must be a primitive in order to function correctly.

This may be a naive throw-in, but the @preact/signals-core implementation uses an incrementing version counter to keep track of isDirty states: https://github.com/preactjs/signals/blob/main/packages/core/src/index.ts#L133

This could potentially act as the decider of whether to recalculate the sinks if isEqual returns true.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants