Images behind the generated images from Stable Diffusion

People have been having fun with text-to-image generators lately. Enter a description, and the AI churns out believable and sometimes detailed images that match the input. These systems work because the models were trained on a lot of data, in the form of images. Andy Baio and Simon Willison made a tool to browse a subset of the data behind the recently released Stable Diffusion.

Incomplete crime data

When the FBI switched to a new data collection system, which relies on local police departments to report their numbers, about 40% of agencies didn’t switch over. The Marshall Project made an interactive to see who’s reporting data in your state:

Many criminologists fear the missing data means the nation would not get reliable crime data for years to come. If a local police department did not report crime data to the FBI, it would also mean scholars, policy makers and the public cannot compare what’s happening with crime in their community with other places.

Where the data from your car flows

Jon Keegan and Alfred Ng, for The Markup, identified 37 companies that collect data from connected cars. On where it goes and how the companies profit:

Once a driver gets into a car, dozens of sensors emit data points that flow to the car’s computer: The driver door is unlocked; a passenger is in the driver’s seat; the internal cabin temperature is 86° F; the sunroof is opened; the ignition button is pressed; a trip has started from this location.

These data points are processed by the car’s computers and transmitted via cellular radio back to the car manufacturer’s servers.

As the trip continues, additional information is collected: the vehicle location and speed, whether the brakes are applied, which song is playing on the entertainment system, whether the headlights are on or the oil level is low.

The data then begins its own journey from the car manufacturer to companies known as “vehicle data hubs” and on through the connected vehicle data marketplace.

Period trackers and legal implications

Given the current restrictions in the U.S., Kendra Albert, Maggie Delano, and Emma Weil discuss data privacy for those who track their periods:

In their investigation, police try to find evidence that someone intended to miscarry, or was otherwise endangering the viability of their pregnancy. This is because a medical abortion presents the same way as a miscarriage, and prosecutors must prove intent or willful endangerment of an embryo or fetus in order to convict someone (though being arrested at all is traumatizing and can cause severe health consequences). Prosecutors must be able to prove their case beyond a reasonable doubt — data from a period tracker app is not enough on its own to prove this, even if it’s relevant.

I think there’s understandably been nervousness around tracking your period, but it seems that from a legal perspective, there’s little risk? Albert, Delano, and Weil also recommend privacy-centric apps and discuss the more technical aspects in a companion article.

Facebook doesn’t seem to fully know how its data is used internally

Lorenzo Franceschi-Bicchierai, reporting for Motherboard on a leaked Facebook document:

“We do not have an adequate level of control and explainability over how our systems use data, and thus we can’t confidently make controlled policy changes or external commitments such as ‘we will not use X data for Y purpose.’ And yet, this is exactly what regulators expect us to do, increasing our risk of mistakes and misrepresentation,” the document read. (Motherboard retyped the document from scratch to protect a source.)

In other words, even Facebook’s own engineers admit that they are struggling to make sense and keep track of where user data goes once it’s inside Facebook’s systems, according to the document. This problem inside Facebook is known as “data lineage.”

Hm.

Tracking the CIA to demo phone tracking

Sam Biddle and Jack Poulson for The Intercept reporting on Anomaly Six, a company that knows a lot about a lot of people through phone data:

To fully impress upon its audience the immense power of this software, Anomaly Six did what few in the world can claim to do: spied on American spies. “I like making fun of our own people,” Clark began. Pulling up a Google Maps-like satellite view, the sales rep showed the NSA’s headquarters in Fort Meade, Maryland, and the CIA’s headquarters in Langley, Virginia. With virtual boundary boxes drawn around both, a technique known as geofencing, A6’s software revealed an incredible intelligence bounty: 183 dots representing phones that had visited both agencies potentially belonging to American intelligence personnel, with hundreds of lines streaking outward revealing their movements, ready to track throughout the world. “So, if I’m a foreign intel officer, that’s 183 start points for me now,” Clark noted.
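The geofencing idea in that passage can be sketched in a few lines. This is only an illustration of the general technique, not Anomaly Six's software: the `Box` class, the `devices_in_all` function, the ping format, and every coordinate below are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A rectangular geofence given by its edges in degrees (hypothetical type)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lon: float) -> bool:
        # Simple rectangle test; ignores boxes that cross the antimeridian.
        return self.south <= lat <= self.north and self.west <= lon <= self.east


def devices_in_all(pings, boxes):
    """Return the device IDs that have at least one ping inside every box.

    pings: iterable of (device_id, lat, lon) tuples (assumed format).
    """
    hits = {}
    for device_id, lat, lon in pings:
        for index, box in enumerate(boxes):
            if box.contains(lat, lon):
                hits.setdefault(device_id, set()).add(index)
    return {device for device, seen in hits.items() if len(seen) == len(boxes)}


# Made-up fences and pings, not real locations.
site_a = Box(south=1.0, west=1.0, north=2.0, east=2.0)
site_b = Box(south=10.0, west=10.0, north=11.0, east=11.0)
pings = [
    ("phone-1", 1.5, 1.5),    # inside site_a
    ("phone-1", 10.5, 10.5),  # inside site_b
    ("phone-2", 1.5, 1.5),    # inside site_a only
    ("phone-3", 50.0, 50.0),  # inside neither
]
both = devices_in_all(pings, [site_a, site_b])
```

Only `phone-1` shows up in both fences, which is the same kind of filter that would narrow a large pile of location pings down to a short list of devices seen at two places.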

Tax services want your data

Taxes are due today in the U.S. (yay). Geoffrey A. Fowler for The Washington Post on the part where tax services like TurboTax and H&R Block ask for your data:

What he discovered is a little-discussed evolution of the tax-prep software industry from mere processors of returns to profiteers of personal data. It’s the Facebook-ization of personal finance.

America’s most-popular online tax-prep service, Intuit’s TurboTax, also asks you to grant it additional access to the data in your return to “enrich your financial profile, communicate with you about Intuit’s services, and provide insights to you and others.”

[…]

The good news is because of Internal Revenue Service rules, this is one data request you can actually say “no” to while continuing to do your taxes online. And if you already clicked “agree” and now have changed your mind, there are some steps you can take, too.

Recontextualized media

The Media Manipulation Casebook summarizes how ill-intentioned people take media from past events, movies, and video games and shove the bits into a different context to serve a different purpose:

Posts with recontextualized media often take advantage of short, less than one-minute video clips that lack much context about where the video originates. One 19-second video clip posted to TikTok on February 24, 2022 depicts two paratroopers mid-flight before switching to a selfie of a man speaking in Russian. The post claimed to show troops descending on Ukraine. One of the posts of this clip received over 1 million interactions on TikTok and was shared across Instagram and Twitter. The short clip was not from 2022, but rather can be traced back to a 2015 Instagram post that had no caption, according to a fact check by Reuters.

Visual forensics to spot fake videos and photos

It’s easy for anyone to grab a picture or video and claim that it shows something that it doesn’t. This is problematic during times of conflict, when accuracy is especially important. For The Washington Post, Elahe Izadi describes how journalists separate real from fake:

The process begins with geolocation: pinpointing exactly where an image was recorded on a map, which Willis calls “the bread and butter” of verification. “We’ll never publish a clip in our blog updates or tweets if we haven’t located it,” she said.

For that, forensic journalists dissect scenes pixel-by-pixel, looking for landmarks, silhouettes and other details, and cross-referencing images using free tools such as Google Earth or the Russian equivalent, Yandex, as well as satellite subscription services. They might also compare several videos of the same incident to unlock more clues. Sometimes something as small as a tile pattern on a roof can hint at where something took place.

Family safety app sells location data to third parties

Life360 is a service that lets families keep track of where members are based on phone location data. For The Markup, Jon Keegan and Alfred Ng report on how Life360 then sells that data to third parties for millions of dollars:

Through interviews with two former employees of the company, along with two individuals who formerly worked at location data brokers Cuebiq and X-Mode, The Markup discovered that the app acts as a firehose of data for a controversial industry that has operated in the shadows with few safeguards to prevent the misuse of this sensitive information. The former employees spoke with The Markup on the condition that we not use their names, as they are all still employed in the data industry. They said they agreed to talk because of concerns with the location data industry’s security and privacy and a desire to shed more light on the opaque location data economy. All of them described Life360 as one of the largest sources of data for the industry.

You kind of expect this from a free app, but Life360 is a paid service that collects children’s location data. Seems questionable.
