Privacy And Power: Your Digital Fingerprint (Part 1) | NBC Nightly News

Channel: NBC News
Published: 12/11/2019 08:48 PM

Everything you do online generates a record. And there are many silent third parties lurking in the background, collecting your personal information and exchanging it with other third parties while you browse the web. In this first installment of “Privacy and Power,” we visit leading computer sc...

Data is the new oil, it's the new gold. It's something that companies use and increasingly want to have, because it helps marketers figure out how to get you to buy things that you may or may not actually want.

Everything we do, even this conversation, everything has a record. Everything we do generates a digital trail. We leave those little crumbs everywhere, and there is always someone there to pick them up, because these data crumbs are worth money.

Very few services are run out of the goodness of the creator's heart. One of the easiest ways to make money from someone using your tool is to collect information about them and either sell that directly or use it to show ads to those people, and this leads to the collection of very detailed dossiers about you individually: the things that you do, the things that you enjoy and the things that you are most likely to purchase. But that's just the tip of the iceberg.

What enables that targeting is what I'll call a surveillance infrastructure, and I use that word surveillance specifically, because there is a power asymmetry. These companies have so much data about individuals, but individuals don't really know what's going on. The kind of perverse thing about all of this is that there's this whole surveillance infrastructure and it's not very good either. There are a couple of studies that have been done that have shown that highly targeted advertising doesn't really generate that much more revenue either. But then it's also being used for other things beyond ads that most people are not aware of.

I kind of see all of this data collection as analogous to pollution. Back in the '60s, we didn't know what the effects of pollution were, and the environmental movement was just getting started, and people were starting to recognize some of the ways that polluting our atmosphere and our environment could come back to bite us. I think we're in that moment now for privacy. We're very gradually figuring out what are the ways in which this data can be misused, how it can fall into the hands of adversarial actors, how it can be used to target and manipulate and persuade people.

The Russian government interfered in our election in sweeping and systematic fashion.

Is it healthy for a democracy to have people receive very different, very targeted ads that are very niche, exactly the hot-button issues that trigger them, while ignoring the other substantive information that's out there? Is that healthy?

There is no comprehensive federal privacy or data security legislation. The Commission has called on Congress to enact comprehensive privacy and data security legislation, and it remains to be seen whether or not that will occur.

But at least having some rules around it and some protections that benefit us all is, I think, way overdue, because we've let companies run rampant in this space and, as they've shown time and time again, they cannot be trusted.

It was my mistake, and I'm sorry. I started Facebook, I run it, and I'm responsible for what happens here.

[music]

We are entering the information age, and the internet is a great place to move tons of information at the speed of light. The fastest growing part of the internet is the World Wide Web, an area that combines text, graphics and pictures and permits people to hop from place to place easily.

When we were working on the web in the '90s, we weren't thinking ahead too far. We were thinking about: can I make the web useful to people? So the early web was text and hyperlinks. That's why it's called hypertext. You would follow a blue link that was annotating a phrase or sentence or a word, and you would click on it. You'd go to a new page. It was great. You could go back, you could find things.

Pretty soon, people got tired of just text and they wanted images as well, and that led to an embedded element you could put in your page, but it referred to a different machine by its internet address. It referred to your friend's cat-picture server, because you wanted to put the cute cat picture on your page. Why not? That was done in 1993, and then in 1994 the cookie came along. You could think of that as a way of saving some information in your browser for every site you visit. It was meant so that when you go to your bank and you sign in, you would not have to sign in every time you went forward or back or started your browser over, because the basic protocol of the web was not designed to remember who you were. It's so-called stateless. Cookies helped the server set some state in the browser. Between the cookie and the image, you had a tracking system without knowing it. And we did this naively. We were saying, let's make the web useful, let's put images in so that people can share cat pictures and make them appear to be on their own pages when they're really coming from that other machine. Let's make cookies, so you don't have to log in all the time.

Essentially every click, every search that we do, every online interaction is recorded, and not just recorded by the company that you interacted with. The creepiest part of this is that it's being recorded by dozens, hundreds, perhaps even thousands of so-called third parties online. By first party, I mean the website that you think you're visiting, and a third party is any other entity that's on that site. It's typically not visible; you're typically not aware you're interacting with them. You see the web page that loads. You don't see any of the details of that information being shuttled back and forth or what they're learning about you.

So what we've been looking at here at Princeton is the kind of data collection that happens when you browse the web, and the way we've done that is my grad students and postdocs and I have built a bot, an automated computer program that pretends to be a real user and browses the web.

It looks at the web's top 1 million websites every month, but it's especially looking at the things on those web pages that a human user would not notice: the cookies and what are called fingerprints and various other tracking technologies.

So I started OpenWPM to visit the top hundred websites. It loads each website in a different window and then collects data about tracking-related practices like cookies or scripts: what kind of data has been accessed and transferred. All these are logged into a database. Most commonly they will just place a cookie with a unique identifier so that they can track you across websites. In the experiments where we enter personal information, such as email and password, as soon as you type an email address, it will be collected and sent to a third party. In other cases, the website will fingerprint your browser.

Cookies are like writing an ID on a name tag. The website says, here's the ID I want you to use. You slap that name tag on, and you can take it off whenever you want. You can clear your cookies. Your web browser can refuse cookies from certain sites. Fingerprinting doesn't work the same way.
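The cookie-plus-embedded-image mechanism described above can be sketched as a small simulation. This is a toy model only, not how any real tracker or OpenWPM is implemented: the class names `ThirdPartyTracker` and `Browser`, the domain `tracker.example`, and the example site names are all hypothetical.

```python
import uuid
from collections import defaultdict


class ThirdPartyTracker:
    """Hypothetical tracker whose image is embedded on many first-party sites."""

    def __init__(self):
        # Maps each cookie ID to the list of first-party sites where it was seen.
        self.visits = defaultdict(list)

    def serve_image(self, cookie_id, embedding_site):
        # First request from this browser: no cookie yet, so issue a unique ID.
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
        # The tracker learns which page embedded its image.
        self.visits[cookie_id].append(embedding_site)
        return cookie_id  # stands in for a Set-Cookie response header


class Browser:
    """Toy browser that stores one cookie per third-party domain."""

    def __init__(self):
        self.cookie_jar = {}

    def visit(self, site, tracker, tracker_domain="tracker.example"):
        # Loading the page also loads the embedded third-party image,
        # and the stored cookie (if any) is sent along with that request.
        stored = self.cookie_jar.get(tracker_domain)
        self.cookie_jar[tracker_domain] = tracker.serve_image(stored, site)

    def clear_cookies(self):
        self.cookie_jar.clear()


tracker = ThirdPartyTracker()
browser = Browser()
for site in ["news.example", "shop.example", "bank.example"]:
    browser.visit(site, tracker)

first_id = browser.cookie_jar["tracker.example"]
history_before_clear = list(tracker.visits[first_id])

# Taking the "name tag" off: after clearing cookies the tracker issues a new ID.
browser.clear_cookies()
browser.visit("news.example", tracker)
second_id = browser.cookie_jar["tracker.example"]
```

In this sketch the tracker links all three sites to one identifier until the cookie is cleared, after which the browser looks like a new visitor. That is the property the name-tag analogy describes, and the one fingerprinting does not share.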

Fingerprinting is about recognizing you, recognizing the small differences that set your web browser apart from other web browsers. This is sort of the way that, if you go to order coffee at the same shop, the barista, who may not know your name until you give it to them, may recognize your order. They recognize how you dress. They recognize your hairstyle. These are identifying in the sense that they let someone pick you out of the crowd.

The way it works is that the company, this third party that's hidden on the site, sends a sequence of commands to your browser that causes it to draw an image, a hidden image that you would not know is being drawn on your screen. But the way that your browser is going to interpret those commands is going to be different based on the version number, based on various other things, even based on the set of fonts you have installed on your computer.

When a website says, hey, try to draw this line, draw this curve, draw this shape, render this image, your web browser goes, oh, graphics editing, that's best suited for a graphics card, and it hands that over to your graphics card. So it draws this picture, and it draws it with so much attention to detail that if you ask any other graphics card running any other version of software running on any other computer to do the same thing, it'll look just a little bit different.

So if you ask the same web browser to draw the same picture again, you'll be able to remember who it was who drew the picture with exactly those pixels in exactly those places.

It looks like a biometric for devices, like a fingerprint, but for your browser. So then, even if you remove or clear your browsing history, you can still be tracked. The scripts or the third parties that we found are very creative in what kind of images they make your browser draw. In that case, for instance, they use a string which contains all the letters of the English alphabet so that they maximize diversity, and use different colors and shapes in the background. Here, for instance, we see different images used by the canvas fingerprinters. You see not only some text, but also some interesting shapes. So canvas fingerprinting is pretty common right now. For instance, here the facebook.com home page actually makes your browser draw this interesting [unclear] with smileys and collects it to fingerprint your browser. Some developer came up with this extension, which visualizes the canvas fingerprint it collects from your browser.

And again, this is an invisible drawing. You wouldn't know that it's going on, but by drawing that image and then reading it back as a sequence of pixels, it's going to get an exact sequence of pixels back that's going to differentiate different users, different devices.

Some machines could be exactly identical if you deliberately set them up the same way, but your machine is probably unique amongst all the machines that are being compared by this website today, over this time period. And it's enough to add you back to the dataset, recognize that you're the same person who they saw on that website, or perhaps a user whose name, whose address, whose phone number is already known through some other data broker.
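The draw-then-read-back idea behind canvas fingerprinting can be illustrated with a short sketch. Everything here is an assumption made for illustration: `render_pixels` is a toy stand-in for real browser rendering, the `font_set` and `gpu_rounding` knobs are hypothetical, and hashing the pixels with SHA-256 is one plausible way to turn them into an identifier, not a description of any particular fingerprinting script.

```python
import hashlib


def render_pixels(text, font_set, gpu_rounding):
    """Toy stand-in for a browser drawing hidden text to a canvas.

    Real rendering differs across devices because of fonts, drivers and
    graphics hardware; here that is faked with two hypothetical knobs.
    """
    pixels = bytearray()
    for index, char in enumerate(text):
        # Font availability changes which glyph shape gets drawn.
        glyph_variant = len(font_set) if char.isalpha() else 0
        # Small rendering differences nudge individual pixel values.
        value = (ord(char) * 3 + glyph_variant + index * gpu_rounding) % 256
        pixels.append(value)
    return bytes(pixels)


def canvas_fingerprint(pixels):
    # Hash the read-back pixels into a short, stable identifier.
    return hashlib.sha256(pixels).hexdigest()[:16]


# A string containing every letter of the English alphabet.
PANGRAM = "The quick brown fox jumps over the lazy dog"

device_a = {"font_set": ("Arial", "Helvetica"), "gpu_rounding": 1}
device_b = {"font_set": ("Arial", "Helvetica", "Comic Sans MS"), "gpu_rounding": 2}

fp_a_first = canvas_fingerprint(render_pixels(PANGRAM, **device_a))
fp_a_again = canvas_fingerprint(render_pixels(PANGRAM, **device_a))
fp_b = canvas_fingerprint(render_pixels(PANGRAM, **device_b))
```

The same simulated device yields the same identifier on every draw, while a device with a different font set and rendering behavior yields a different one. Nothing is stored in the browser, which is why clearing stored data does not reset it.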
Recently, in the last year or two, one thing we've been investigating is how third parties on web pages can, in a very blatant way, without resorting to cookies, fingerprinting, none of that stuff, just gobble up your personal data from websites.

Of course, all the websites you visit, how much time you spend on them, your mouse movements, key presses, what part of the page you interact with: all these are sent in real time on websites that use session replay scripts.

A session replay is something like a video recording of your browser screen when you're browsing a website, and this video recording is done not by the company whose website you're looking at. It's done by a third party. They're not reputable companies that you might have heard of. You don't know who they are. You don't know that company is recording your screen. You don't know who's going to get access to those videos. In that process these companies try to redact your passwords and credit card numbers and so on, so to a certain degree they're trying to do the right thing. But doing that redaction in an automated fashion is technically hard, I would say essentially impossible, so they fail a lot of the time. So we've found that credit card information, health information like your drugs and prescriptions, student information, all of that sensitive information is getting into the hands of third parties.

There are a number of extensions you can install in your browser to protect yourself against this kind of sneaky tracking: Privacy Badger, Ghostery, uBlock Origin. You install any one of these, they're going to do a pretty good job of protecting you from most, certainly not all, but most of the hidden third-party tracking.

So in mobile apps the same type of tools don't exist by default. Your mobile device does not come off the shelf with the tools sufficient to see where data is being sent from the apps that you install. There are some base-level privacy controls on mobile apps, but we found that by and large they don't actually work. In many of the cases they don't actually do the things that they're supposedly doing.
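The point made above about session replay, that automated redaction fails a lot of the time, can be sketched with a pattern-based redactor. This is a minimal illustration under stated assumptions: the `CARD_PATTERN` rule and the sample strings are hypothetical, and real session replay products may redact in entirely different ways.

```python
import re

# Hypothetical rule: mask 16-digit card numbers, optionally grouped
# in fours by spaces or dashes.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


def redact(text):
    # Replace anything that matches the card pattern before recording it.
    return CARD_PATTERN.sub("[REDACTED]", text)


# The expected format is caught.
caught = redact("Card: 4111 1111 1111 1111")
# An unusual grouping slips past the pattern.
missed_card = redact("Card: 4111.1111.1111.1111")
# Sensitive free text has no fixed shape for a pattern to match.
missed_health = redact("Prescription: 20mg of a hypothetical drug")
```

A rule like this only protects data that arrives in the shape its author anticipated. Anything else, including free-text health or student information, passes through to the recording unchanged.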

