increase the efficiency of PBT


How to increase the efficiency of Predictable Behavioral Targeting?


While working on the fraud DB and on a social networking project, I had the idea of combining the two topics. It may be possible to improve the performance of PBT algorithms by analyzing social networks. That may sound a bit odd at first, but let me explain.


What is Predictable behavioral targeting?


Predictable behavioral targeting (PBT) is used in the affiliate and marketing industry as well as in the security industry to forecast the most probable upcoming behavior of a user or target. It builds on probability theory. I have already described one technique that implements this idea: collaborative filtering. As described there, collaborative filtering works with predefined groups. The system has enough information about a group of users, and when a new user appears, it compares that user's parameters with each group in order to assign the user to the best-fitting one.
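The matching step described above can be sketched in a few lines. This is only a minimal illustration, assuming that users and groups are represented as interest vectors and compared with cosine similarity; the group names, topics and weights are made up for the example and are not taken from a real system.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_a * norm_b)


def best_group(user, groups):
    """Return the name of the group whose profile is most similar to the user."""
    return max(groups, key=lambda name: cosine(user, groups[name]))


# Hypothetical group profiles: average interest weight per topic.
GROUPS = {
    "gamers": {"games": 0.9, "hardware": 0.6, "music": 0.2},
    "travelers": {"flights": 0.8, "hotels": 0.7, "music": 0.3},
}

# Hypothetical new user with only two known interests.
new_user = {"games": 1.0, "music": 0.5}
print(best_group(new_user, GROUPS))
```

Cosine similarity is just one possible choice here. Any measure that compares a user's parameters with a group profile would fit the same structure.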


And what about this social networking thing?


The interests of today's world are changing rapidly. Five years ago, instant messengers like ICQ and MSN were used all over the world; today everyone has an SMS flat rate or uses WhatsApp. People's interests change over time, so the matching groups in the Predictable behavioral targeting pattern have to follow these changes. Many operators are either unaware of this or neglect it. As a result, it is quite possible for the accuracy of a system to drop by 50% after 4-6 years. The predefined groups therefore have to be adapted regularly. One way to do that is to use a learning system, but what if that system learns from incorrect mappings (false positives and false negatives, also known as Type I and Type II errors)?

A possible starting point is the information provided by social networks. In American culture, "liking" relevant pages is very common. On average, each U.S. user likes 10 pages per day. If you can access this data, it is easy to update the target groups and, in addition, to refine the mapping between users and groups. This requires only a simple matrix analysis of the collected data.
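One way to read "simple matrix analysis" is as an aggregation over the user-by-page like matrix: for every group, compute the share of its members who like each page, and use that as the refreshed group profile. The sketch below assumes exactly that reading. The function name, the data layout and the sample data are hypothetical.

```python
from collections import defaultdict


def refresh_profiles(likes, membership):
    """Recompute each group's profile as the share of members liking each page.

    likes:      {user_id: set of liked page names}
    membership: {user_id: group name}
    """
    counts = defaultdict(lambda: defaultdict(int))  # group -> page -> likes
    sizes = defaultdict(int)                        # group -> member count
    for user, pages in likes.items():
        group = membership.get(user)
        if group is None:
            continue  # users without a group do not contribute
        sizes[group] += 1
        for page in pages:
            counts[group][page] += 1
    return {
        group: {page: n / sizes[group] for page, n in counts[group].items()}
        for group in sizes
    }


# Hypothetical sample data: rows are users, columns are liked pages.
likes = {
    "u1": {"WhatsApp", "Spotify"},
    "u2": {"WhatsApp"},
    "u3": {"ICQ"},
}
membership = {"u1": "messaging", "u2": "messaging", "u3": "legacy"}

profiles = refresh_profiles(likes, membership)
print(profiles["messaging"])
```

Running this periodically on fresh data would let the group profiles drift along with the users' interests instead of staying fixed.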


And how do you want to get this data from social networks like Facebook?


Thanks to the lax privacy practices and the open mentality in the U.S., it is relatively simple. Most pages that are liked are publicly available. All it takes is a simple script that calls the page and collects the corresponding data. To be honest, the "hardest" part is the crawler function: the problem is how to find the next interesting site.
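The "how do I get my next interesting site" problem is usually handled with a frontier queue: every visited page contributes its links, and unseen links are appended to the queue. The following is a minimal breadth-first sketch under that assumption. The `fetch` callable is a placeholder for whatever HTTP client you use, and the URLs are invented; here a small dictionary stands in for the network so the example runs offline.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch, limit=100):
    """Breadth-first crawl. fetch(url) returns the page HTML or None."""
    queue = deque(seeds)
    seen = set(seeds)
    visited = []
    while queue and len(visited) < limit:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page could not be retrieved, skip it
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


# Stand-in for a real HTTP client (hypothetical pages).
PAGES = {
    "https://example.org/a": '<a href="/b">b</a><a href="/c">c</a>',
    "https://example.org/b": '<a href="/a">a</a>',
    "https://example.org/c": "",
}
print(crawl(["https://example.org/a"], PAGES.get))
```

Keeping the download logic behind `fetch` means that rate limiting and robots.txt handling can live in one place without touching the frontier logic.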


What could be a possible starting point for such a crawler?


The network of acquaintances and interests is best represented in so-called “traffic exchange networks” or “follower exchange networks”. The purpose of these platforms is to generate a large number of likes, followers and subscribers on social networks such as Facebook and Twitter. As you can read in this article from IBN Live (Global Broadcast News), exchange networks have become a big deal in recent years, especially during the last election campaign in the US. To get this reach, webmasters publish their website on one of the network platforms and pay for the display of a link. In most cases the advertisers are large companies or political parties that want to boost their public relations. Start by crawling these links and you will have a nice, long starting list for your crawler.



Once you have collected a larger amount of data, please be sure to anonymize it. This is not just about privacy as such; nobody wants to be named in this context. Not you, not me, nobody. So please respect that. Afterwards you can use this information as the predefined basis for the matching groups. The rest of the procedure follows the well-known process of collaborative filtering.
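As a first step towards the anonymization mentioned above, user identifiers can be replaced by keyed hashes before the data is stored. The sketch below uses HMAC-SHA256 from the Python standard library. Note that this is pseudonymization only: the liked pages themselves can still identify a person, so it does not replace a proper anonymization concept. The function names, the key and the sample data are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(user_id, secret_key):
    """Replace a user ID with a keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize_likes(likes, secret_key):
    """Return a copy of the like data with every user ID pseudonymized."""
    return {
        pseudonymize(user, secret_key): set(pages)
        for user, pages in likes.items()
    }


# Hypothetical key; in practice it is kept secret and stored apart from the data.
KEY = b"replace-with-a-random-secret"

raw = {"alice.example": {"WhatsApp"}, "bob.example": {"ICQ", "MSN"}}
safe = anonymize_likes(raw, KEY)
print(sorted(len(k) for k in safe))
```

Because the same key always yields the same pseudonym, a user's likes can still be merged across crawl runs without the original ID ever being stored.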


Jani Podlesny

Head of Engineering

I focus on Data Architecture and Analytics for Management Consulting across EMEA and the US. Driven by my passion for Data Profiling & Privacy, I am doing PhD research at the Hasso Plattner Institute.

Berlin Lab

Berlin, Germany
  032 229 340 927

Wuerzburg Lab

Würzburg, Germany
  032 229 340 927