Lessons Learned

Regarding SEAGLASS Methodology

For each stage of the project:

1. Formalization of alliances and initial setup tests

    1. Although the call to participate as a Local Coordinating Organization was quite successful, the organizations involved held a latent commitment to keep the project out of reach of each country’s media outlets for as long as the FADe project was under development.
    2. During the project’s development, purchasing feature phones became quite complicated, given the limited market stock of the specific model that the methodology requires. This could also be a limitation for other groups that want to apply the SEAGLASS methodology.
    3. Purchasing the T191 communication cables also became quite complicated, since shipping and delivery could take a couple of months. For that reason, our team decided to buy the parts and build the cables ourselves, which required a certain level of skill and knowledge of electronics.
    4. After some experimenting, we learned that it is essential to be careful about how the Serial – USB cables are assembled. In some tests, heavy physical use affected their correct functioning; these faults were eventually corrected. In addition, the USB-to-serial converter is not easy to find in most Latin American markets. We suggest checking both retail and international suppliers to obtain the stock the project needs.

 

2. Installation and start of field tests

  1. Although the written guide for reproducing the setup and the template of training activities ended up being the same for every local organization, it was necessary to adapt the contents to each particular case with regard to local geography, lexicon, and technical level.
  2. Initially, we tested three different versions of the SEAGLASS app (0.7.1 to 0.7.3) because of bugs in the cellular data collection, which made it challenging to prepare the training materials efficiently. Distributing the app through the Google Play Store can also affect how it is served.
  3. Although the latest version of the SEAGLASS application had an interface with a comfortable user experience, it still takes a while to learn and understand all its features.
  4. The current UX of the SEAGLASS app is confusing for some local operators with regard to wording and terminology. We recommend a UX assessment to detect potential improvements to the app.
  5. Each local organization should develop its own threat model. It is highly recommended that each organization be treated individually.
  6. During the project’s development, it turned out to be necessary to assign a number of fixed and backup sensors, depending on each city’s size and local characteristics.
  7. Given the context of the selected cities, variables such as power outages, traffic difficulties, and organizations closing their offices, among others, must be considered when building a backup plan.
  8. As the reality of a specific city or activist collective changed, we needed to adapt the measurement plan dynamically together with the Local Coordinating Organization.
  9. To deploy the SEAGLASS app on the sensor phones, we decided for security reasons that every phone must have its own Google account, unrelated to those of the other sensors. This brings some logistical problems when activating new accounts, given that each new account must be linked to a valid phone number and no more than two (2) new accounts can be activated with the same number.
  10. It is highly recommended to test the compatibility between the Serial – USB cable and the feature phones right before deploying the whole sensor. In one case, the Local Coordinating Organization had found the feature phones locally, even from the same provider we used for the other cities tested. However, once we got there, our Serial – USB cables did not work well with those phones, so we had to stay longer than planned and rebuild the cables, which fortunately ended up working correctly.
  11. One of the cities had issues with the deployment of the sensors. Specifically, a group of old devices (or feature phones) had a locked bootloader, making it impossible to install custom firmware. This had to be solved together with the SEAGLASS developer team by creating an environment that allowed this equipment to be reflashed with an operating system without bootloader restrictions.
  12. The SEAGLASS sensor setup that uses the custom cable for both charging and data transmission has been presenting functional issues. This might be due to the smartphone’s power demand and the physical fragility of the soldered wires. The setup is more convenient in terms of the steps needed to run the sensor, but it is more prone to errors that require cable replacement or further troubleshooting.
  13. Partners suggested that the SEAGLASS sensor setup is cumbersome for some to operate and could be improved by migrating its architecture to a Raspberry Pi approach, in which the device only needs to be turned on to start functioning. Another suggestion is a strategy that uses only a smartphone.
  14. We see an opportunity for future analysis of this network infrastructure, which could serve as input for proposing best practices and regulations that ensure more harmonized network configurations. Security and privacy baselines can benefit users and researchers by making it easier to detect anomalies that may point to illegal surveillance.
  15. We consider it a limitation that the SEAGLASS methodology was designed only for 2G/3G protocols. We proposed researching ways to port the technology to 4G/LTE and to establish the basis for 5G developments. This could also involve using different sensor parts, which would address the difficulty of finding discontinued equipment.
  16. Many of the smartphones purchased to build the sensors and execute the project were the same brand and model. A software update disabled the network capabilities of these phones, so we had to tell the local partners not to update them until a solution or workaround was found.
  17. A critical dependency of the SEAGLASS-side data analysis pipeline (Wireshark) had a bug in a rarely used functionality, which delayed the process while it was being solved. The SEAGLASS team has already taken the corresponding corrective actions, and the affected part of the analysis pipeline has kept running smoothly.

    3. Exclusive testing period

    1. Batteries and spare serial-USB cables must be given to local teams so they can replace equipment in case of failure.
    2. Some of the serial-USB cables taken from one city of interest to another (with a specific USB-to-serial adapter model) were damaged during travel. It is believed that the adapter is sensitive to the X-rays used in airports.
    3. Some of the local partners asked not to receive any detailed location status reports because of security concerns. In response, we decided to review and change the way reports are built to address the risk of sharing exact location information.
    4. Reading the data was extraordinarily complex and computationally demanding. We suggest drawing up a detailed, progressive dashboard development plan in which not only the location of the sensors is monitored in real time, to check that data is being collected, but also the geographic zones, so that conclusions can be drawn about both cell towers and potential IMSI-Catchers.
    5. Determining some of the information requires intermediate to advanced knowledge of mobile technology in order to interpret specific values that are not currently referenced in basic manuals.
    6. We learned many things about carriers’ technical implementations during this project, especially that there is a considerable disparity in how networks are implemented, even within the same country.
    7. The batteries of the feature phones don’t seem to last. Looking deeper into this issue in the cities under study, we noticed a pattern in which the altitude of the monitored places was related to the time it took the feature phones to run out of power. In the highest city, feature phone battery life was about 20% of that in the towns closer to sea level. After researching the case, we found some interesting variables that might be affecting the behavior of these batteries:
      1. The altitude itself does not directly affect the batteries’ duration since they should be sealed and the pressure should be constant.
      2. The batteries’ age since manufacture could affect their duration, which decays over the years and becomes considerably more sensitive to other variables. Since this phone model was announced in late 2005, we suppose many of these batteries were manufactured at least ten years ago.
      3. The number of base stations might also affect discharge time while the phone is searching for a mobile signal. However, given that the feature phones are always doing the same task of finding new base stations, this doesn’t explain the difference in discharge time among the monitored cities.
      4. The primary variable that might affect the discharge time of the batteries is extreme temperature. In colder climates, battery duration can drop by a considerable proportion. Some sources even say that for every 5 degrees below 20 degrees Celsius, the battery’s time to discharge is cut in half.
      5. For this scenario, we highly recommend keeping the sensors within acceptable temperatures while in use (20 to 35 degrees Celsius). This could be done by keeping the sensor inside temperature-controlled vehicles, or close to the operator’s body when walking, biking, etc.
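
To get a feel for what the rule of thumb quoted in item 4 would imply, the following minimal sketch applies it to a baseline runtime. It is only an illustration under that rule, not a measured battery model: the function name, the 8-hour baseline, and the default parameters are assumptions made for this example.

```python
def estimated_runtime_hours(baseline_hours, temp_c, reference_c=20.0, step_c=5.0):
    """Apply the quoted rule of thumb: runtime halves for every
    `step_c` degrees below `reference_c` (all values are assumptions)."""
    if temp_c >= reference_c:
        # The rule says nothing about warmer conditions, so keep the baseline.
        return baseline_hours
    steps_below = (reference_c - temp_c) / step_c
    return baseline_hours * 0.5 ** steps_below


if __name__ == "__main__":
    # Hypothetical 8-hour baseline at 20 degrees Celsius.
    for temp in (20, 15, 10, 5):
        print(temp, estimated_runtime_hours(8.0, temp))
```

Under this rule, a phone that lasts 8 hours at 20 degrees Celsius would last about 2 hours at 10 degrees, which is in the same range as the roughly 20% battery life observed in the highest city.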

    4. Data processing

    1. The large-scale analysis of measurements has been extremely time-consuming. For future projects, we suggest a method designed specifically for more efficient data processing.
    2. Given the significant amount of data gathered by the sensors, we have concluded that there are still many potential tests that can be developed and improved to detect network anomalies, e.g., location area code analysis, signal strength analysis for specific cells, tower distance estimation, and comparison with known values.
    3. We faced several challenges when analyzing results from the large amount of data processed within this project’s framework. Careful decisions were required regarding computational efficiency and the capacity of the infrastructure used. At this point in the analysis, the SEAGLASS team takes data samples and tests different analysis methodologies on them, such as basic workflows proposed by the same group or trained machine learning models, to evaluate their effectiveness by managing the data independently.
    4. During the initial steps of analyzing results, an error was found in a popular traffic analysis tool (Wireshark) that prevented the effective processing of telephone communication packets. Since the analysis of this type of information is less common, the error had not been noticed until that moment.
    5. When analyzing data from environments different from those where the methodology was initially tested, we noticed multiple disparities in the way cellular antennas are configured. This 1) makes it even more challenging to understand the baseline needed to detect anomalies, and 2) opens a massive number of possibilities to find other types of anomalies that may or may not lead to the detection of IMSI-Catchers. Such distinctive readings were due, among other things, to the characteristics of the telecommunications infrastructure itself, given that in some cases these facilities had been installed several decades earlier.
    6. During the work session at the University of Washington (UW), it became clear that there are multiple ways to detect anomalies in the cell phone data collected, many of which need time to be validated before they are integrated into workflows and public results reports. That is why we decided that the published results will be dynamic, with new cases refined and added as they are analyzed and validated. We believe that doing so will affect the established work schedule as little as possible.
    7. The server where the data is centralized belongs to the University of Washington, which makes it difficult to increase the volume of data sent to them (and, with it, the number of places to monitor), because we depend on their staff to prepare batches of data migrations. This slowed the transmission of large database updates. We recommended open-sourcing the server and adding an option in the app to configure a self-hosted SEAGLASS data reception server.
    8. The difference in the data collected by the SEAGLASS and Crocodile Hunter methodologies is notable. The first collects more than 300 fields of network parameters, while the second collects only 19. This is probably because Crocodile Hunter collects less information and performs several internal processing steps to calculate anomalies automatically during monitoring.
    9. Since we needed to reassign sensors to new operators and cities, we had to generate new “UUID” identification codes in the SEAGLASS application to make it easier to separate the data collected by this equipment in the future.
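
As an illustration of the kind of location area code test mentioned in item 2, the sketch below flags cells that were observed with more than one location area code. It is not the SEAGLASS pipeline's method: the record layout and the field names `cell_id` and `lac` are hypothetical, and a flagged cell is only a candidate for manual review, not evidence of an IMSI-Catcher.

```python
from collections import defaultdict


def cells_with_multiple_lacs(measurements):
    """Return {cell_id: sorted list of LACs} for every cell seen with
    more than one location area code (field names are hypothetical)."""
    lacs_by_cell = defaultdict(set)
    for record in measurements:
        lacs_by_cell[record["cell_id"]].add(record["lac"])
    return {
        cell_id: sorted(lacs)
        for cell_id, lacs in lacs_by_cell.items()
        if len(lacs) > 1
    }


if __name__ == "__main__":
    # Made-up sample records, only to show the expected input shape.
    sample = [
        {"cell_id": 101, "lac": 5},
        {"cell_id": 101, "lac": 9},
        {"cell_id": 202, "lac": 5},
    ]
    print(cells_with_multiple_lacs(sample))
```

Given the disparities in how carriers configure their networks, any such check would need a per-city baseline before its results could be interpreted.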

    5. Promotion of methodology and results

    1. It was quite challenging to decide which technologies to use to build the website so that there were no incompatibilities with the data resulting from the analysis of the information collected. To address this, we decided to use tools flexible enough to include arbitrary code within the pages that make up the site.
    2. During the project, one of our primary goals was to translate the research’s technical subjects into more straightforward language for non-technical audiences, building all the documentation on the premise that users might not be familiar with the basics of mobile phone communications. We realized that the actors directly involved need more information at hand.
    3. Judging by the large number of questions, concerns, and expressions of interest from the audience, this topic seems to be fascinating not only for academic researchers but also for journalists, media outlets, activists, and free speech NGOs, given the approach of using this methodology as a tool to advocate against illegal monitoring practices.

    Regarding Crocodile Hunter Methodology

    For each stage of the project:

     

    1. Formalization of alliances and initial setup tests

      1. Although the call to participate as a local coordinating organization was quite successful, the organizations involved held a latent commitment to keep the project out of reach of each country’s media outlets for as long as the FADe project was under development.
      2. Since we were reusing the SEAGLASS sensors in this phase of the project as a complement to the Crocodile Hunter sensors, it became necessary to make some adjustments to the smartphones’ configuration, removing all the historical information that could in any way compromise the identity of the previous operators.
      3. Given that the Crocodile Hunter code was being updated on a rolling basis, and that the Raspberry Pis acquired by the project operate at the limit of their capabilities, we considered overclocking the Raspberry Pi 4s as a preventive measure and prepared a backup plan of replacing them with laptops. This plan covers the worst-case scenario in which, at some point, the Raspberry Pis would no longer be capable of handling the tasks of the Crocodile Hunter software.

     

    2. Installation and start of field tests

    1. The Crocodile Hunter setup and software changed over time, forcing the project to prepare for potential redesigns of the scope, the equipment needed, and the local capacity requirements. In our communication loop with the LCOs, we included the knowledge needed to address these possible changes, which would not affect the delivery of the proposed products.
    2. This methodology was not friendly to the use case of the operators taking the measurements. This is mainly because it requires an active internet connection, which can be challenging to ensure in a portable setup for operators without the technical skills to configure it (Wi-Fi hotspots, prior configuration of the sensor hotspot, and potentially an SSH connection). We recommended that future iterations of the software include an offline mode to capture raw data without internet access. Later, at the operator’s home, the sensor could be powered on and connected to the internet by cable to run the checks and tests that require connectivity.

     

    3. Exclusive testing period

    1. We experienced some relaxation of the COVID-19 confinement measures, which made it easier to build routes across the cities under study. We adapted our strategy to the quarantine measures and maintained all the sanitary safety controls, relying as much as possible on private vehicles to carry out the monitoring.
    2. We found that Crocodile Hunter sensors (covering 4G/LTE) were considerably easier to deploy in a range of setups than the previously used SEAGLASS sensors (covering 2G/3G), for instance in vehicles, in portable bags, and fixed in static places such as buildings.
    3. We experienced some failures while testing new sensors. After some research, we saw that the GPS receivers used in the project have poor sensitivity indoors, and even inside cars through the windshield. Hence, we made an internal patch to the install process so that the sensors could be set up without the GPS receiver, and we changed the sensors’ positioning so that they face the open sky when operating.

     

    4. Data processing and report generation

    1. The Crocodile Hunter sensors’ need for internet access during measurement proved problematic in some cases where operators did not have the technical skill to connect the sensors to wireless or wired network points. This led to problems with the data and with checking the antennas against public services such as OpenCellID. The analysis was subsequently done manually with this service and with the Google Geolocation API.
    2. The difference in the data collected by the SEAGLASS and Crocodile Hunter methodologies is notable. The first collects more than 300 fields of network parameters, while the second collects only 19. This is probably because Crocodile Hunter collects less information and performs several internal processing steps to calculate anomalies automatically during monitoring.
    3. We collected considerably less data than during the previous monitoring. According to our follow-up, the reasons could be:
      1. A reduced measurement timeframe: two months vs. three months for the initial batch of cities.
      2. The lockdown measures recently implemented by governments due to COVID-19, which prevented us from doing the regular rounds across the city.
      3. The capacity of the local partners, who, given the current context, were able to host the sensors only in static strategic places, without moving them.
    4. Some critical data gathered and calculated by the sensor is not physically consistent, which makes the analysis data difficult to use. For instance, the estimated distance value is so inconsistent that we disregarded it in our analysis, even though it would be a critical value to have if it were reliable.
    5. The Crocodile Hunter sensor stores a large number of parameters in a raw, unified form (the entire System Information Block 1 from the connection with the phone tower), which makes this data difficult to use for analysis. We proposed helping the EFF, in future iterations, to develop the code needed to parse this raw data into the dozens of parameters that could considerably raise the quality of the analysis.
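
One way to check the physical consistency of a distance estimate like the one in item 4 is to compare it with the great-circle distance between the sensor's GPS position and a tower location obtained from a public database. The sketch below uses the standard haversine formula; it does not reflect how Crocodile Hunter computes its estimate, and the function names, the coordinate tuples, and the 1 km tolerance are assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_distance_plausible(sensor, tower, estimated_km, tolerance_km=1.0):
    """Compare a sensor-reported distance estimate with the distance computed
    from coordinates. `sensor` and `tower` are (lat, lon) tuples; the
    tolerance is an arbitrary value chosen for illustration."""
    actual_km = haversine_km(sensor[0], sensor[1], tower[0], tower[1])
    return abs(actual_km - estimated_km) <= tolerance_km
```

A check like this depends on the accuracy of the tower coordinates in the reference database, so a mismatch would point to a reading worth reviewing rather than to a definite error.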

     

    5. Promotion of methodology and results

    1. It was quite challenging to decide which technologies to use to build the website so that there were no incompatibilities with the data resulting from the analysis of the information collected. To address this, we decided to use tools flexible enough to include arbitrary code within the pages that make up the site.
    2. During the project, one of our primary goals was to translate the research’s technical subjects into more straightforward language for non-technical audiences, building all the documentation on the premise that users might not be familiar with the basics of mobile phone communications. We realized that the actors directly involved need more information at hand.
    3. Judging by the large number of questions, concerns, and expressions of interest from the audience, this topic seems to be fascinating not only for academic researchers but also for journalists, media outlets, activists, and free speech NGOs, given the approach of using this methodology as a tool to advocate against illegal monitoring practices.

    FADe project is an initiative of South Lighthouse with the support of the Open Technology Fund.

     

    This website is available under a Creative Commons Attribution 4.0 International (CC BY 4.0) License creativecommons.org