How to see who is holding what of your data...
Facebook sends you an HTML collection of various items, some useful and some not. You download a ZIP archive. There is a summary of your profile, a collection of your posts to your timeline, a list of all of your friends (including those who have left Facebook) and when you connected with them, and any videos and photos that you have posted. Two items that are worth more inspection are a list of advertisers that have your information: I noticed quite a few entries to more than a dozen different state chapters of Americans for Prosperity PACs that are funded by the Koch brothers. Finally, there is a list of your phone’s contacts that it grabbed if you ran its Messenger application, which it justifiably has been getting a lot of heat for doing. Note that this is different from your friend list.
LinkedIn sends you a ZIP collection of CSV files that you can open in separate spreadsheets that contain different lists. There are your contacts (which they call your connections), your messages that you have exchanged with other LinkedIn members, recommendations that you have made and have been sent to you, and other items. Most of the files contained just a single line of data, which made looking at all of them tedious. LinkedIn actually sends you two collections of files: you should ignore the first one (which you get almost immediately) and wait for the “final” archive, which is more complete and arrives several hours later. Most of this data is rather matter-of-fact. One file contains a summary of your profile that is used for ad targeting, but there is no list of advertisers like with the other networks. Another file contains the IP addresses and dates of your last 50 logins, and another contains the dates and names of people that you have searched for on the network. What bothered me the most about my list of LinkedIn connections was the number of them differed by two percent from what is displayed on my LinkedIn home page and in the spreadsheet itself. Why the difference? I have no idea.
Google operates somewhat differently and more opaquely than the others mentioned here. First, you go to the link above, which is a separate service that will collect your Google archive. The screen shot shows you just some of the dozens of different Google services that you can select to use in the gathering process. In my experiment this process took the longest: more than three days, whereas the others took minutes to several hours. Even before you get your archive, scanning this list and selecting which services you want included in your report is a depressingly lengthy activity. When I finally got my archive, it spanned three ZIP files and more than 17GB in total, which is more than all the others combined.
However, that is just the beginning. When you bring up a web page that shows the various Google services, you have to separately extract the data for each service individually and each service uses it own data format that you then need to view in a particular application: for example, your calendar items are in iCal format, your email data is in MBOX format, and others are extracted in JSON format. Analyzing all this information can probably take a data scientist the better part of a few days, let alone you and I, who don’t have the tools, dedication or time. If you are thinking of de-Googling your life, you will have to do more than just switch to an iPhone and give up Gmail.
But wait, there is more: emails that you delete or find their way into your Spam folder are still part of your archive. In the Googleplex, everything is accounted for. Note that if you have uploaded any music to Google Play Music, this data isn’t part of your archive and you’ll have to download that separately.
Twitter will send you two files: one that is a PDF attachment that contains a list of all the advertisers that have your information, but the advertisers’ names are shown in their Twitter IDs and thus not very meaningful. The second document is an Html collection of all your tweets, and you can bring up your browser or access the data via in two formats: JSON and CSV exports by month and year. Notice that there is nothing mentioned about downloading all of your Twitter followers: you will have to use a third-party service to do this. One thing I give Twitter props for is that you have a very clear series of settings menus that might be useful to study and change as well, including connected apps and privacy settings. Facebook and LinkedIn constantly are rearranging these menus and make changes to their structure and importance, which makes them more difficult to find when you are concerned about them. But Twitter at least give you more control over your privacy settings and tries to make it more transparent.
Hat tip to http://blog.strom.com/wp/?p=6497
Posted: Friday 27 April 2018