Directions: In the question given below, a short paragraph is given. Select the answer choice that provides the correct sentence that completes the passage and is contextually and grammatically correct:
On Friday, Facebook banned Cambridge Analytica (CA). We have been talking about the role CA's uniquely-targeted advertising approach played in the 2016 US presidential election since just after the election. This much-more-recent ban occurred because of a breach of data management protocol (which broadly covers how data are obtained, transferred, and stored) and not because of the way those data were used. An academic researcher, Aleksandr Kogan, obtained the data by asking users to opt in to an app designed to estimate users' personalities from their pattern of behavior on Facebook. _______________(A)_______________.
Facebook found out about the break in data management protocol and requested that CA delete the data. CA agreed, but then Facebook found out from a whistleblower that they had lied, and so now CA is banned. _______________(B)_______________. Such prediction and targeting happens every day, anytime you engage in a behavior that can be linked to your identity, either online, through social media profiles that track individuals across websites by comparing email addresses or site cookies, or in the 'real world', with purchases made at different stores using different bank and credit cards being matched up by credit reporting agencies.
Most of this prediction happens in the background, with consumers rarely thinking about it, and consent for the collection and use of data exists in the fine print of user agreements that most of us click through without thinking. _______________(C)_______________. If a researcher were to infer political orientation by politicians a person supports, we would call that face valid data. That is, the measure (politicians supported) is clearly related to the thing we're trying to predict (political orientation).
What's less intuitive is that most - if not all - of your personal attributes can be guessed (even if imperfectly) by any information that is known about you. Measures do not need to be face valid to provide accurate estimates. If we can establish that one thing is consistently related to another, it doesn't matter if that link is obvious or causal. _______________(D)_______________. This is commonly referred to as an empirical, or bottom-up, or data-driven approach to measurement.
_______________(E)_______________. This is an example of the principle of aggregation: more data is always better, even if some or all of that data is of poor quality. Of course, you need less high-quality data to get the same accuracy of prediction; but if high-quality data might be suspect (for example, concerns about lying in direct, face-valid measures) or just flat out aren't available (for example, in-depth measures of millions of internet users), lots of low-quality data will do just fine.
1.
A. The problem began when Dr. Kogan chose to provide the data to someone else, and that is why CA has been banned from Facebook: not because they accessed and used the data, but because they didn't go through the proper channels to do it.
B. Aleksandr Kogan collected direct messages sent to and from Facebook users who installed his This Is Your Digital Life app.
C. A small number of people who logged into This Is Your Digital Life also shared their own news feed, timeline, posts, and messages, which may have included posts and messages.
D. In 2014, Facebook’s platform policy allowed developers to request mailbox permissions but only if the person explicitly gave consent for this to happen.
E. Kogan told the New York Times that he took messages only from people who had installed his app, not their friends and that none of the information was shared with Cambridge Analytica.
Answer - Option A
Explanation - Option B is incorrect because it talks about Kogan's app and not about Cambridge Analytica. The line following the blank talks about CA.
Option C is incorrect because it focuses on the app's role in collecting data but it does not describe the role of Cambridge Analytica. The line following the blank talks about CA.
Option D is incorrect. The statement talks about Facebook's policy; it does not talk about how Cambridge Analytica misused it.
Option E is incorrect because it is a statement by Kogan, which is irrelevant as the line following the blank talks about the deletion of data.
2.
A. The predictability of individual attributes from digital records of behaviour may have considerable negative implications, because it can easily be applied to large numbers of people without their individual consent and without them noticing.
B. But what's receiving the most attention is how those data were used, and the extent to which seemingly innocuous online behaviors can be used to predict users' characteristics is shocking to most people.
C. The algorithm used in the Facebook data breach trawled through personal data for information on sexual orientation, race, gender – and even intelligence and childhood trauma.
D. A few dozen “likes” can give a strong prediction of which party a user will vote for, reveal their gender and whether their partner is likely to be a man or woman, provide powerful clues about whether their parents stayed together throughout their childhood.
E. Some results may sound more like the result of updated online sleuthing than sophisticated data analysis; “liking” a political campaign page is little different from pinning a poster in a window.
Answer - Option B
Explanation - Option A is incorrect. It talks about predicting behavior. The line following the blank talks about targeting that users are subjected to using their data.
Option C is incorrect. It gives an overview of the algorithm and does not explain predictions and targeting that the line following the blank talks about.
Option D is incorrect. It describes how likes can be used to predict behavior, but it is irrelevant as the line following the blank talks about behavior which has digital records.
Option E is incorrect. This statement is irrelevant as it does not relate to online or digital data.
Option B is correct as it describes the usage of data and digital records.
3.
A. In the real world, it’s not clear whether personality-based profiling would be better than the myriad other ways to target people that Facebook already provides.
B. The data set Kogan passed on could be politically useful whether or not it directly informed personality models.
C. We easily understand that something like political orientation may be guessed by seeing that a person likes or follows certain politicians or organizations.
D. Kogan suggested the exact model used doesn’t matter much, though – what matters is the accuracy of its predictions.
E. Likes provided a cheap, easily accessible behavioural record of billions of people all in one place, conveniently formatted for machine analysis.
Answer - Option C
Explanation - Option A is incorrect. It talks about personality-based profiling and Facebook. It does not talk about political orientation.
Option B is incorrect as it is ambiguous and talks about models.
Option D is incorrect as it talks of models and predictions, which is contextually irrelevant.
Option E gives a general idea about likes but it does not relate to the line after the blank.
Option C is correct as it talks about the political orientation which is what the line following the blank talks about.
4.
A. It is difficult to establish a correlation with weak data sample.
B. Data can be mined by the companies at a cost which is trivial when compared to results.
C. It is irrelevant whether there is a link between the data and the behaviour.
D. All that matters is that the link does exist, and now we can use it to make predictions.
E. Establishing a link is only half the job, it is important to check its accuracy as well.
Answer - Option D
Explanation - Option A is redundant as it talks of correlation while the preceding line talks of consistency.
Option B is irrelevant as the line preceding the blank talks about the quality of data and predictions and not data mining.
Option C is incorrect as it contradicts the argument made by preceding lines.
Option E is incorrect because the preceding lines talk about accuracy and its significance.
Option D is correct as it talks about the link between data and behavior.
5.
A. The model which processes the data should not be vulnerable to data fluctuations.
B. Data becomes irrelevant after a point because of constant change in behavioural patterns.
C. Data quality may affect the accuracy of the predictions to a great degree.
D. A huge amount of information is needed to make deductions as the quality of data is poor.
E. Putting together a lot of these weak pieces of information allows us to make valid inferences.
Answer - Option E
Explanation - Option A is incorrect as it talks about a model but the line following the blank talks about aggregation.
Option B is incorrect because it talks about data becoming irrelevant, which is not related to aggregation.
Option C is incorrect as it talks about the quality of data which is not related to aggregation.
Option D is incorrect because it talks about the quantity of data which is not related to aggregation.
Option E is correct as it describes putting together many weak pieces of information to make valid inferences, which is the principle of aggregation that the line following the blank talks about.