Add Perseus data #3
Comments
I'd be curious to see what an SML development of these ideas would look like. Perhaps modules would give a good abstraction for each level at which we want to interact with the language (e.g., letter, syllable, word, clause, etc.)?
We should probably add the Perseus repos as submodules. Adding a test to verify that the data loads successfully would be helpful.
@scott-fleischman Thanks for the added detail! As you mention, ML modules might be a good fit for the kind of abstractions you're talking about. On the other hand, it feels like Haskell's laziness could be a big win when treating "the whole Perseus corpus" as one big data structure that we want to query into...
I wonder if it would be worthwhile to create a separate Haskell library that provides access to the Perseus data.
It may be a good idea; I'm afraid I probably won't have time to look into this during the next week. So if anyone else wants to take a look, you won't be stepping on my toes.
I won't be getting to it in the next week either; maybe in the next month or two it will become useful as we work on our own from-scratch morphological analysis.
One thing I would like to find out sooner rather than later is whether LSJ is available in the data.
Load XML from https://github.com/PerseusDL
Data we may want to extract:
I imagine we want to be able to access the data:
LSJ: Is the full work available here? If so, we will want to extract more precise information from it for lemmas, principal parts, dialect forms, etc.
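As a starting point for the loading test mentioned in the comments, here is a minimal sketch that only checks that a submodule checkout exists and contains non-empty XML files. The path `data/perseus` and the names `perseusRoot` and `findXmlFiles` are assumptions made for illustration, not anything that exists in the repo, and it uses only the `directory` and `filepath` packages. Actually parsing the XML would need whichever XML library we settle on.

```haskell
module Main (main) where

import Control.Monad (filterM, unless, when)
import System.Directory (doesDirectoryExist, doesFileExist, getFileSize, listDirectory)
import System.Exit (exitFailure)
import System.FilePath (takeExtension, (</>))

-- | Hypothetical location of a Perseus submodule checkout (an assumption;
-- adjust to wherever the submodules end up living).
perseusRoot :: FilePath
perseusRoot = "data" </> "perseus"

-- | Recursively collect every file with a .xml extension under a directory.
findXmlFiles :: FilePath -> IO [FilePath]
findXmlFiles dir = do
  entries <- map (dir </>) <$> listDirectory dir
  files <- filterM doesFileExist entries
  subdirs <- filterM doesDirectoryExist entries
  nested <- concat <$> mapM findXmlFiles subdirs
  pure (filter ((== ".xml") . takeExtension) files ++ nested)

main :: IO ()
main = do
  -- Fail early if the submodule was never initialised.
  present <- doesDirectoryExist perseusRoot
  unless present $ do
    putStrLn ("Missing " ++ perseusRoot ++ "; was the submodule initialised?")
    exitFailure
  xmlFiles <- findXmlFiles perseusRoot
  sizes <- mapM getFileSize xmlFiles
  let empties = [f | (f, size) <- zip xmlFiles sizes, size == 0]
  -- An empty checkout or zero-byte files both count as a failed load.
  when (null xmlFiles) $ do
    putStrLn "No XML files found."
    exitFailure
  unless (null empties) $ do
    mapM_ (putStrLn . ("Empty file: " ++)) empties
    exitFailure
  putStrLn ("Found " ++ show (length xmlFiles) ++ " non-empty XML files.")
```

This is deliberately shallow: it would catch a missing or uninitialised submodule, but not malformed XML.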