Standardized Social Web Portability Fileformat #190

Open
upintheairsheep opened this issue Nov 28, 2024 · 2 comments

Comments

upintheairsheep commented Nov 28, 2024

Introduction

It's nearly impossible to migrate automatically from one social platform to another. You can't just transfer all your TikToks to YouTube Shorts or Instagram Reels, you can't just move your tweets to Facebook, etc. Additionally, data portability is required under the GDPR and CCPA.

Use Cases (Recommended)

Very common

Goals (Optional)

Provide a common file format for social media post streams, covering all major forms of social media such as video platforms and microblogging platforms. Common file formats for activity data (watch/view history, likes, location history, etc.) should be considered as well, with consideration for allowing algorithms to learn about you. A common file format for chats should be added as well, supporting both social media chats and regular SMS/MMS/RCS chats.

Non-goals (Optional)

A list of aspects of this challenge/problem or related challenges/problems that this proposal does not need to address.

Proposed Solution

See https://www.w3.org/2024/09/25-data-portability-minutes.html

Examples (Recommended)

Share one or more examples of how your proposal would be used by developers.

Alternate Approaches (Optional)

Not thought of yet.

Privacy & Security Considerations

The only security concern is that it could be used to spread spam, but it should be more of a universal export format that platforms like Mastodon and Bluesky can import. If spam is a concern, the Data Transfer Initiative's framework could address it by preventing spam.

Let’s Discuss (Optional)

A framework that lets major social companies use more data, such as view counts and like counts from the old account, and lets the target platform read it would be good, but it should prevent spam. As an example, transferring videos from YouTube to LibertyVids (a fictional platform) should take into account view counts and like counts on the previous platform (YouTube) and accordingly suggest the high-view-count and highly rated videos to more people. When the data is uploaded as a file, anyone can just change the view and like counts to a literal googol and submit it. A framework powered by the data transfer foundation should be able to transfer content directly from one platform to another, while respecting the fact that not everyone's phone can fit a potentially gigantic zip file.

See https://www.w3.org/2024/09/25-data-portability-minutes.html , and my reply to it is:

Hello, I've read your presentation. A few weeks ago I emailed the W3C about the same problem. There's MBOX for emails, VCF for contacts, ICS for calendars, and Bookmarks.html for bookmarks, but there should be standardized formats for social media content, online activity logs such as view, like, and search history, and chats. The TikTok ban is my main concern: if there's no way to import all of my TikTok data into IG or YT Shorts, my 20,000-follower account, millions of views, and hundreds of pieces of content all go down the drain. I've seen work on microformats, but it has not come to fruition since the late 2000s. Since most work is on social media, there should also be work on the related data that forms part of a social network: chats, photo libraries, and settings formats.

For post history, I've considered either adopting the fediverse CSV format or creating a new JSON or XML format. It is important to note that Bluesky (AT Protocol) unfortunately exceeds Mastodon and ActivityPub now. You decide the format, but it should be an open format. All data the social site holds should be added to the rows to comply with laws like the GDPR and CCPA, where even the slightest missing setting could result in noncompliance issues if someone decides to report it. The format should be editable on GitHub and provide ways to extend it, but not end up like XMPP.

The format should also cover direct comments on other social media platforms. For example, if you replied to a Twitter post or commented on a TikTok video, the reply should be transferable to the destination with its original link, author, and possibly title. It would appear as if you had replied to a "ghost post" that is shown as a system message.

I also propose a universal chat export/import format for PMs and messages, as they are a huge part of social media. It should be JSON or XML with the extension system mentioned earlier. It should support the text itself, timestamps, reactions, stickers, attachments, and everything else. Group chats should also be supported. The recipient should be @[email protected] (fediverse format) when transferring, for example, from Snapchat to WhatsApp, or any other combination; however, if it’s a phone number it should be “+000000000000” or “00000000000” or “0000000000@tel”. The universal chat format should also have a controller as well as the sender and receiver(s), as some messengers like WhatsApp and Snapchat use such system messages to refer to events like people joining a group, screenshot detection, and more. For example, if someone moves their Roblox chats to Discord, the Roblox chats will be available as an archive and you can see the whole chat, but you cannot send new messages in them; you have to merge them into their equivalents. However, see the follower list section for a good solution.
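As a rough illustration of the chat format described above, here is a minimal Python sketch of one possible JSON layout. Every field name (`sender`, `recipients`, `controller`, `kind`, `extensions`), the `version` key, and the `.example` addresses are assumptions made for this sketch and are not part of any existing standard.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List, Optional


@dataclass
class ChatMessage:
    """One exported message. All field names are hypothetical."""
    sender: str                       # fediverse-style address, e.g. "@alice@snapchat.example"
    recipients: List[str]             # one entry for a PM, several for a group chat
    timestamp: str                    # ISO 8601 string
    text: str = ""
    kind: str = "message"             # "message" (from a person) or "system" (from the controller)
    controller: Optional[str] = None  # platform that emits system events such as joins
    reactions: List[Dict] = field(default_factory=list)
    attachments: List[Dict] = field(default_factory=list)
    extensions: Dict = field(default_factory=dict)  # platform-specific extras


def normalize_phone(raw: str) -> str:
    """Render a phone number in the '<digits>@tel' style, dropping '+' and punctuation."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    return f"{digits}@tel"


def export_chat(messages: List[ChatMessage]) -> str:
    """Serialize a list of messages into a single JSON document."""
    return json.dumps({"version": 1, "messages": [asdict(m) for m in messages]}, indent=2)


if __name__ == "__main__":
    chat = [
        ChatMessage(
            sender="@alice@snapchat.example",
            recipients=[normalize_phone("+1 (555) 010-0000")],
            timestamp="2024-11-28T12:00:00Z",
            text="hi",
        ),
        ChatMessage(
            sender="snapchat.example",
            recipients=["@alice@snapchat.example"],
            timestamp="2024-11-28T12:01:00Z",
            kind="system",
            controller="snapchat.example",
            text="A screenshot was taken",
        ),
    ]
    print(export_chat(chat))
```

Keeping system events in the same list as ordinary messages, distinguished only by `kind`, lets an importer show the archive in order and simply skip the events it has no equivalent for.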

I propose a format for activity logs such as watch history, search history, like history, dislike history, etc. Loosely inspired by https://developer.apple.com/documentation/SafariServices/importing-data-exported-from-safari , it should be either JSON or CSV, with a field for the type of history. For social media content, the watch history, like history, dislike history, and similar entries should be URLs pointing to the content, so that the importing app can use them. Each entry should have the title/description, the author/curator, and possibly the category of the content so that it can be digested by other algorithms. This is especially important considering that social media platforms aim to serve personalized content: if a user moves from TikTok to YouTube Shorts after a possible ban by the government of any country, the history can be fed into YouTube Shorts or Instagram, which will then know what you watch and serve personalized content. Search history, however, can be plain-text searches rather than URLs. IP activity logs and login/logoff records can also be included, for example to tell algorithms when you get in and out of work or school. Location history in this format is useful too, considering how Snapchat keeps that data, and it could be uploaded and transferred to Google to add to its location history data.
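To make the activity-log idea concrete, here is a small Python sketch that writes such a log as CSV. The column names and the set of history types are assumptions chosen for illustration; the proposal itself only asks for a type field, URLs for content entries, and plain text for searches.

```python
import csv
import io

# Hypothetical column set: one "type" column plus the descriptive fields
# mentioned in the proposal (title, author, category).
FIELDS = ["type", "timestamp", "url", "query", "title", "author", "category"]
URL_TYPES = {"watch", "like", "dislike"}  # entries that must point at content


def make_record(kind, timestamp, *, url="", query="", title="", author="", category=""):
    """Build one log row, enforcing URL-for-content and text-for-search."""
    if kind in URL_TYPES and not url:
        raise ValueError(f"{kind} history entries need a URL")
    if kind == "search" and not query:
        raise ValueError("search history entries need the query text")
    return {
        "type": kind, "timestamp": timestamp, "url": url, "query": query,
        "title": title, "author": author, "category": category,
    }


def to_csv(records):
    """Serialize records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


if __name__ == "__main__":
    log = [
        make_record("watch", "2024-11-28T12:00:00Z",
                    url="https://video.example/v/123",
                    title="Example clip", author="@example", category="comedy"),
        make_record("search", "2024-11-28T12:05:00Z", query="cat videos"),
    ]
    print(to_csv(log))
```

A single file with a type column keeps the importer simple; separate files per history type would work equally well and is a design choice left open here.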

Settings should also be part of this format, for example notification and privacy preferences. For instance, limits.timeLimit = 2 could be the entry that transfers a 2 hours per day preference between platforms. It is important to map each preference on the major social platforms to its equivalents on other platforms. Block lists are not included here, as they should be in the same format as the following lists. Profile pictures and banners should also be transferable, including profile picture history. Another easy one is basicInfo.pronouns = he/him, which can easily transfer.
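The preference mapping could look something like the following Python sketch. The two keys `limits.timeLimit` and `basicInfo.pronouns` come from the text above; the destination setting names and the hours-to-minutes conversion are invented purely for illustration, since each real platform would need its own table.

```python
def parse_settings(text):
    """Parse 'key = value' lines into a dict, skipping blank or malformed lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical mapping for one imaginary destination platform:
# portable key -> (destination setting name, value converter).
DESTINATION_MAP = {
    "limits.timeLimit": ("daily_screen_time_minutes", lambda hours: int(float(hours) * 60)),
    "basicInfo.pronouns": ("pronouns", str),
}


def translate(settings, mapping):
    """Return (translated settings, list of keys the destination has no equivalent for)."""
    translated, unmapped = {}, []
    for key, value in settings.items():
        if key in mapping:
            name, convert = mapping[key]
            translated[name] = convert(value)
        else:
            unmapped.append(key)
    return translated, unmapped


if __name__ == "__main__":
    exported = "limits.timeLimit = 2\nbasicInfo.pronouns = he/him\nfeed.autoplay = off"
    result, skipped = translate(parse_settings(exported), DESTINATION_MAP)
    print(result)
    print(skipped)
```

Reporting the unmapped keys instead of silently dropping them gives the importing platform a way to tell the user which preferences did not carry over.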

Following lists should be a simple CSV or JSON file, and should also include social media connections. They should contain the usernames, human-readable names (e.g. "Zach D. Films"), URLs/websites, emails, phone numbers, and bios/descriptions: not just the original platform username, but also the linked accounts. Many social media platforms, including TikTok, Discord, Roblox, and others, provide a way for people to link their other social media accounts to the platform. As an example, a TikTok profile, @example, can be @[email protected];@[email protected];@[email protected] , since you can link Instagram and YouTube profiles to your TikTok account; the user @example has two other usernames which could not be detected otherwise. If someone following @example moves to one of those two platforms, the follow will be carried over with them, and if they move from TikTok to X/Twitter, the platform might suggest other profiles, making it more likely to find the account. Block lists can use exactly the same format, just with a different filename.
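A minimal Python sketch of how an importer might read the semicolon-separated linked accounts is shown below. It assumes each handle has the fediverse-style shape `@user@domain`; the `.example` domains and usernames are placeholders, not the values from the original example.

```python
def parse_handles(field):
    """Split '@user@domain;@user2@domain2' into a list of (user, domain) pairs."""
    handles = []
    for part in field.split(";"):
        part = part.strip().lstrip("@")
        if not part:
            continue
        # Split on the last '@' so the domain is everything after it.
        user, sep, domain = part.rpartition("@")
        if not sep or not user or not domain:
            raise ValueError(f"not a user@domain handle: {part!r}")
        handles.append((user, domain))
    return handles


def handle_on(field, domain):
    """Return the followed account's username on `domain`, or None if not linked."""
    for user, linked_domain in parse_handles(field):
        if linked_domain == domain:
            return user
    return None


if __name__ == "__main__":
    # One row of a following list: the followed profile plus its linked accounts.
    linked = "@example@tiktok.example;@example2@instagram.example;@example3@youtube.example"
    print(handle_on(linked, "instagram.example"))  # username to follow after moving
    print(handle_on(linked, "x.example"))          # no linked account on this platform
```

When `handle_on` finds nothing, the destination platform would have to fall back on the other fields in the row, such as the human-readable name or website, to suggest likely matches.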

Also consider adding support for previous stories and ephemeral-text (like Twitter Fleets) history, and photo galleries like Snapchat Memories.

Finally, take feedback from GitHub to add any missing data to it.

Additionally, try to have all W3C members from Apple to Zoom, especially Google, develop open-source scripts to convert their exports into whatever format takes the lead, including exports from their discontinued products like Google+, Reader, Buzz, and more.

I personally remember that in 2019 I could not import my Google+ export in any manner. I was stuck with some random ZIP file that will never be able to fulfill its purpose (until a converter comes out), as MeWe's solution supported JSON exports only, not HTML exports. So this issue personally resonates with me.

Also see https://ictinstitute.nl/wp-content/uploads/2023/12/Master_Thesis_Data_Portability_SNS_JBenistant_FINAL.pdf and http://microformats.org/wiki/social-network-portability

csarven commented Nov 28, 2024

@upintheairsheep

Perhaps comes closest to what you're looking for?

Sort of, but what I'm proposing is more of a file format than a protocol. It could certainly be based on it, though.
