Latest revision as of 14:23, 19 April 2025
Ticketmaster Bot Detection System
Goal
Automated bots are a threat to the Ticketmaster user experience. How can Ticketmaster identify “true fans” of an artist or band, and how can it distinguish them from bots?
Data
We were provided with three datasets. The first captured user clickstream data over the past 12 months, showing how customers navigated through the Ticketmaster website. The second summarized advertising campaigns run via Google, and the third detailed event-level information including genre, location, and time.
Cleaning and merging the data was the most time-intensive task. Ultimately, we engineered a dataset with one row per unique customerID-concertID-concertGenre-DateTime combination. We then built a network in which each node represented a concert and each customer’s history linked the concerts they attended. If a customer bought tickets to both a jazz concert and a country concert, that created an edge between the two corresponding nodes. Users who attended only a single concert didn’t contribute any edges to the network, as no co-attendance relationships could be inferred.
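The edge rule described above can be sketched in a few lines. This is a minimal illustration rather than our actual pipeline: the row layout, the sample IDs, and the function name `build_coattendance_edges` are hypothetical, and weighting each edge by the number of customers who link the two concerts is an added assumption.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical rows: (customer_id, concert_id, genre, datetime string).
purchases = [
    ("c1", "jazz_01", "jazz", "2024-05-01T20:00"),
    ("c1", "country_07", "country", "2024-06-12T19:30"),
    ("c2", "jazz_01", "jazz", "2024-05-01T20:00"),
    ("c3", "jazz_01", "jazz", "2024-05-01T20:00"),
    ("c3", "jazz_02", "jazz", "2024-07-03T21:00"),
    ("c3", "country_07", "country", "2024-06-12T19:30"),
]


def build_coattendance_edges(rows):
    """Return {(concert_a, concert_b): number of customers linking them}."""
    # Collect the distinct concerts each customer bought tickets to.
    concerts_by_customer = defaultdict(set)
    for customer_id, concert_id, _genre, _when in rows:
        concerts_by_customer[customer_id].add(concert_id)

    # Every pair of concerts in one customer's history becomes an edge.
    # A customer with a single concert yields no pairs, hence no edges.
    edge_weights = defaultdict(int)
    for concerts in concerts_by_customer.values():
        for a, b in combinations(sorted(concerts), 2):
            edge_weights[(a, b)] += 1
    return dict(edge_weights)


edges = build_coattendance_edges(purchases)
for pair, weight in sorted(edges.items()):
    print(pair, weight)
```

Sorting each customer's concerts before pairing keeps the edges undirected, so `("country_07", "jazz_01")` and its reverse are never counted separately.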
Hypothesis
We hypothesized that network topologies would differ meaningfully between large and small spenders. Specifically, we expected large spenders (>$600/year) to behave more like generalists—crossing genre boundaries frequently—while small spenders (≤$600/year) would show more genre loyalty, resulting in tightly clustered, genre-specific subnetworks.
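As a sketch of how the two groups could be separated before building one network per group, the snippet below totals each customer's spend and applies the $600 threshold. The `(customer_id, amount)` order format and the name `split_by_spend` are assumptions for illustration; only the threshold comes from the hypothesis.

```python
from collections import defaultdict

SPEND_THRESHOLD = 600.0  # dollars per year, taken from the hypothesis


def split_by_spend(orders, threshold=SPEND_THRESHOLD):
    """Split customers into (large, small) spender sets by annual total."""
    totals = defaultdict(float)
    for customer_id, amount in orders:
        totals[customer_id] += amount

    # Large spenders are strictly above the threshold; a customer at
    # exactly $600 counts as a small spender.
    large = {c for c, total in totals.items() if total > threshold}
    small = set(totals) - large
    return large, small


# Hypothetical one-year order history.
orders = [("c1", 250.0), ("c1", 400.0), ("c2", 600.0), ("c3", 80.0)]
large_spenders, small_spenders = split_by_spend(orders)
print(sorted(large_spenders), sorted(small_spenders))
```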
Findings
Our network analysis was consistent with this hypothesis. In the small-spender graph, we observed distinct genre-based clusters, suggesting these users gravitate toward specific genres. In contrast, large spenders produced a much denser and more interconnected network, with fewer clearly defined genre-based groupings. This supports the notion that high spenders are less selective and are likely motivated by resale value rather than personal taste.
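One way to put numbers on this comparison is sketched below. These two measures (graph density and the share of edges that stay within a single genre) are stand-ins chosen for illustration, not necessarily the metrics used in our analysis, and the toy graphs are invented.

```python
def density(num_nodes, num_edges):
    """Fraction of possible undirected edges that are present."""
    if num_nodes < 2:
        return 0.0
    return 2 * num_edges / (num_nodes * (num_nodes - 1))


def within_genre_share(edges, genre_of):
    """Fraction of edges whose two concerts share a genre."""
    if not edges:
        return 0.0
    same = sum(1 for a, b in edges if genre_of[a] == genre_of[b])
    return same / len(edges)


# Toy example: four concerts, two genres (all names hypothetical).
genre_of = {"j1": "jazz", "j2": "jazz", "k1": "country", "k2": "country"}
small_edges = [("j1", "j2"), ("k1", "k2")]
large_edges = [("j1", "j2"), ("j1", "k1"), ("j1", "k2"),
               ("j2", "k1"), ("k1", "k2")]

for name, edge_list in [("small", small_edges), ("large", large_edges)]:
    print(name,
          round(density(len(genre_of), len(edge_list)), 3),
          round(within_genre_share(edge_list, genre_of), 3))
```

In this toy data the small-spender graph is sparse with every edge inside one genre, while the large-spender graph is denser and mostly crosses genres, which mirrors the qualitative pattern described above.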
These insights point toward a potential strategy for identifying “true fans”: users whose concert-attendance behavior forms tight genre-specific clusters are more likely to be genuine fans than resellers. Ticketmaster could use this behavioral signal, among others, to inform fraud detection systems and prioritize access for actual fans.
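A per-user version of this signal could look like the sketch below. It swaps the network-level clustering for a simpler per-customer genre concentration score (a Herfindahl-style sum of squared genre shares); this is an assumed proxy, not something we validated, and the function name is hypothetical.

```python
from collections import Counter


def genre_concentration(genres):
    """Sum of squared genre shares: 1.0 means one genre only,
    values near 1/k mean purchases spread evenly over k genres."""
    counts = Counter(genres)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((n / total) ** 2 for n in counts.values())


# Hypothetical purchase histories, one genre label per ticket purchase.
loyal_fan = ["jazz", "jazz", "jazz", "jazz"]
generalist = ["jazz", "country", "rock", "pop"]

print(genre_concentration(loyal_fan))
print(genre_concentration(generalist))
```

A customer with a single purchase trivially scores 1.0, so a score like this would only be meaningful for customers with several purchases and, as noted above, alongside other signals.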