Difference between revisions of "Assignments WS 2021/2022"
Change in this revision (section "Sample size for a/b test", line 279), added after the author line:
Approved Tomáš (talk) 02:24, 28 December 2021 (CET)
Revision as of 02:24, 28 December 2021
Please put your assignments here. Do not forget to sign them. You can use ~~~~ (four tildes) for an automatic signature. Use Show preview to check the result before your final submission.
Please strive to formulate your assignment carefully. We expect an adequate effort to formulate the assignment, as it is your semestral paper. Do not forget that your main goal is a research paper. This means your simulation model must generate results that are specific, measurable and verifiable. Think twice about how you will develop your model and which entities you will use, draw a model diagram, and consider what you will measure. Submit your assignment only once you have a good idea of the model. And of course, read How to deal with the simulation assignment.
Topics on gambling, cards, etc. are not welcome.
To avoid possible confusion, please check that we have added approved in bold somewhere in our comment under your submission. If there is no approved, the assignment has not been approved yet.
Contents
- 1 Spread of covid19 in closed/open area markets
- 2 Effects of COVID-19 vaccination on the spread of infection
- 3 Simulation of genetic algorithm: Travelling Salesman Problem
- 4 Optimizing the process of baking wedding sweets
- 5 Carsharing company fleet optimization
- 6 How long would it take to find love
- 7 Sample size for a/b test
- 8 Scooter rental
- 9 Is it possible to stop fake news?
- 10 Simulation 3D print
Spread of covid19 in closed/open area markets
In winter 2021, Christmas markets in Czechia were banned due to another wave of COVID-19 infections. On the other hand, people are free to go to shopping malls. It would be interesting to use existing data about COVID-19 transmission in an agent-based simulation to see how many people get infected, and how quickly, depending on whether they are in a Christmas (open) market or in a shopping mall (closed). The main goal is to see whether the simulation backs up the decision that was made about Christmas markets.
Possible research papers that contain data about covid spreading
- Kaggle notebook - Covid19, Evolution, Transmission, Spatial Patterns
- Research - Understanding COVID-19 transmission, health impacts and mitigation: timely social distancing is the key
This simulation would be realised using NetLogo.
Summary:
WHAT will be simulated
- a marketplace, which can be either an open or a closed space
- people with or without masks, who walk from shop to shop with some intention; some of them are virus carriers
- the virus, which spreads in the places people pass through (the infection rate differs between the closed and the open area)
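The setup listed above can be illustrated with a minimal Python sketch (not the NetLogo model itself). It assumes random mixing instead of movement between shops, and the per-contact transmission probabilities, the crowd size, and the function name `simulate_market` are hypothetical placeholders, not values taken from the cited data.

```python
import random


def simulate_market(n_people=200, n_carriers=5, p_transmit=0.05,
                    contacts_per_step=3, steps=50, seed=0):
    """Return how many people are infected after `steps` ticks.

    Assumption: every infected person meets a few random visitors per tick
    (random mixing); a real model would move agents between shops.
    """
    rng = random.Random(seed)
    infected = [i < n_carriers for i in range(n_people)]
    for _ in range(steps):
        newly_infected = []
        for person in range(n_people):
            if not infected[person]:
                continue
            for _ in range(contacts_per_step):
                other = rng.randrange(n_people)
                if not infected[other] and rng.random() < p_transmit:
                    newly_infected.append(other)
        # Apply infections at the end of the tick so they spread from the next tick on.
        for other in newly_infected:
            infected[other] = True
    return sum(infected)


# Hypothetical per-contact probabilities for the two environments (not from data).
open_market = simulate_market(p_transmit=0.01)
shopping_mall = simulate_market(p_transmit=0.05)
print("open market:", open_market, "shopping mall:", shopping_mall)
```

Running both scenarios with the same seed and crowd size isolates the effect of the environment-specific transmission probability, which is the quantity the research question asks about.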
GOAL of the simulation
- answer the question: "Where is the virus spread more significant? At the market place, or at the shopping mall?"
TOOL used for the simulation
- NetLogo
- Agent based simulation
Author: Angel Kostov, xkosa20
Effects of COVID-19 vaccination on the spread of infection
Problem definition
Currently, there is a new wave of the COVID-19 pandemic with a high number of infections. In Germany, for example, more than 50,000 new infections are currently reported every day. To reduce the infection rate, a wide variety of measures have been implemented, among them vaccination and masks. Vaccination can reduce the risk of infection and the likelihood of transmission. A simulation is conducted to show clearly the extent to which vaccination could contain the pandemic.
Simulation
The purpose of the simulation is to show how COVID-19 vaccination affects the spread of the pandemic. I will use an agent-based model in order to simulate the scenario in a simplified form based on existing scientific data. In addition, current COVID-19 measures are considered.
Subjects of the simulation:
Environment:
- A village with around 6,000 inhabitants
- Simplified: the village is closed, that is, new people cannot come in and the inhabitants cannot leave
Agents:
- Vaccinated persons
- Unvaccinated persons
- Simulations with different vaccination rates to compare the different infection courses
Further measures:
- All people wear masks
- Nobody wears a mask
Start:
- A few agents are unknowingly infected (e.g. 0.1% of the inhabitants = 6 people)
- Movement of agents intended to reflect the daily behavior of people in real life in a simplified form.
- One agent can infect another agent with a certain probability if they are close to each other.
- The probability of an infection depends on the measure (vaccination / mask)
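As a minimal sketch of how the infection probability could depend on the measures, the following Python function multiplies a base per-contact probability by one reduction factor per measure. The multiplicative form and all numeric defaults (`ve_infection`, `ve_transmission`, `mask_reduction`) are assumptions for illustration; real values would have to come from the data sources listed below.

```python
def infection_probability(base, source_vaccinated, target_vaccinated, masks,
                          ve_infection=0.6, ve_transmission=0.4,
                          mask_reduction=0.5):
    """Per-contact infection probability under the active measures.

    All default effectiveness values are hypothetical placeholders.
    """
    p = base
    if target_vaccinated:
        p *= 1 - ve_infection      # vaccinated target is less likely to be infected
    if source_vaccinated:
        p *= 1 - ve_transmission   # vaccinated source is less likely to transmit
    if masks:
        p *= 1 - mask_reduction    # masks reduce transmission for everyone
    return p


# Initial number of infected agents from the "Start" assumptions above.
initial_infected = round(6000 * 0.001)
print(initial_infected)
print(infection_probability(0.1, source_vaccinated=True,
                            target_vaccinated=True, masks=True))
```

Keeping each measure as a separate factor makes it easy to run the scenarios "all people wear masks" and "nobody wears a mask" with different vaccination rates.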
Goal:
- Identify the infection course with different vaccination rates and measures.
- Showing the importance of vaccination.
Method
- NetLogo
Possible data sources
https://www.rki.de/SharedDocs/FAQ/COVID-Impfen/FAQ_Liste_Wirksamkeit.html
Chia, P. Y., Xiang Ong, S. W., Chiew, C. J., Ang, L. W., Chavatte, J.-M., Mak, T.-M., Cui, L., Kalimuddin, S., Chia, W. N., Tan, C. W., Ann Chai, L. Y., Tan, S. Y., Zheng, S., Pin Lin, R. T., Wang, L., Leo, Y.-S., Lee, V. J., Lye, D. C., & Young, B. E. (2021). Virological and serological kinetics of SARS-CoV-2 Delta variant vaccine-breakthrough infections: A multi-center cohort study. Clinical Microbiology and Infection, S1198743X21006388. https://doi.org/10.1016/j.cmi.2021.11.010
Eyre, D. W., Taylor, D., Purver, M., Chapman, D., Fowler, T., Pouwels, K., Walker, A. S., & Peto, T. E. (2021). The impact of SARS-CoV-2 vaccination on Alpha and Delta variant transmission [Preprint]. Infectious Diseases (except HIV/AIDS). https://doi.org/10.1101/2021.09.28.21264260
Harder, T., Külper-Schiek, W., Reda, S., Treskova-Schwarzbach, M., Koch, J., Vygen-Bonnet, S., & Wichmann, O. (2021). Effectiveness of COVID-19 vaccines against SARS-CoV-2 infection with the Delta (B.1.617.2) variant: Second interim results of a living systematic review and meta-analysis, 1 January to 25 August 2021. Eurosurveillance, 26(41). https://doi.org/10.2807/1560-7917.ES.2021.26.41.2100920
Author: Laura Kundmueller
- OK, but please elaborate on it a bit. What exactly should the simulation look like, what kinds of agents, etc.? And mainly: the sources of data. Tomáš (talk) 12:00, 10 December 2021 (CET)
Simulation of genetic algorithm: Travelling Salesman Problem
Simulation
The topic of this simulation is an old graph problem, the Travelling Salesman Problem. My approach would be based on a genetic learning algorithm. A random map will be generated at the start. The salesman travels in a car with some gas. The gas is used up as he travels; it can be refilled at gas stations, but this costs money. The map contains hills and flat roads, which have different gas costs when driven through.
The goal is:
- to find the optimum path between the towns.
The parameters are:
- number of agents (travelling salesmen)
- gas in car
- money
- number of towns
- number of hills
- number of gas stations
Method
- NetLogo
Author: Mart13 (talk) 09:57, 9 December 2021 (CET)
- Although this isn't a true agent-based simulation, we sometimes accept topics from artificial intelligence and other related fields. However, it is necessary to elaborate on it in depth. How exactly will the algorithm work? What is the goal (not the goal of the agent, but the goal of this work)? Etc. Tomáš (talk) 12:03, 10 December 2021 (CET)
Algorithm:
- Basic genetic learning; a gene is the path of an agent. My idea is that every patch of the map has a different cost of passing through. The agents must decide where to go and in which order.
- Fitness function is a score of an agent (the score of its path).
- I would like to implement mutation and crossover.
Additional parameters to those introduced earlier:
- population size
- number of mutations
- number of crossovers
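The algorithm outlined above (path as gene, path score as fitness, mutation and crossover) can be sketched in Python as follows. This is an illustration under stated assumptions, not the proposed NetLogo model: fitness is the plain Euclidean tour length instead of patch-dependent gas cost, selection keeps the better half of the population, and the function names are hypothetical.

```python
import math
import random


def tour_length(tour, towns):
    """Fitness: length of the closed tour (assumed stand-in for the gas cost)."""
    return sum(math.dist(towns[tour[i]], towns[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))


def order_crossover(parent_a, parent_b, rng):
    """Copy a slice of parent_a in place; fill the rest in parent_b's order."""
    i, j = sorted(rng.sample(range(len(parent_a)), 2))
    middle = parent_a[i:j + 1]
    rest = [town for town in parent_b if town not in middle]
    return rest[:i] + middle + rest[i:]


def swap_mutation(tour, rng, rate):
    """With probability `rate`, swap two towns in a copy of the tour."""
    tour = tour[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(tour)), 2)
        tour[i], tour[j] = tour[j], tour[i]
    return tour


def evolve(towns, population_size=50, generations=200, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    n = len(towns)
    population = [rng.sample(range(n), n) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=lambda t: tour_length(t, towns))
        survivors = population[:population_size // 2]  # keep the better half
        children = []
        while len(survivors) + len(children) < population_size:
            a, b = rng.sample(survivors, 2)
            child = order_crossover(a, b, rng)
            children.append(swap_mutation(child, rng, mutation_rate))
        population = survivors + children
    return min(population, key=lambda t: tour_length(t, towns))


towns = [(0, 0), (0, 1), (1, 1), (1, 0)]
best = evolve(towns)
print(best, tour_length(best, towns))
```

Order crossover is used here because a naive one-point crossover on two paths can visit a town twice; any permutation-preserving operator would serve the same purpose.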
Goal of work:
- graphical interpretation of AI learning
Tomáš Martínek (mart13) (talk) 11:49, 11 December 2021 (CET)
- Approved
Optimizing the process of baking wedding sweets
Simulation
There is a wedding tradition in the Czech Republic of baking wedding sweets and handing them out to the wedding guests as a form of invitation. The baking usually takes a whole day and requires several helpers in the kitchen. Two types of sweets are usually packaged into paper baskets: several small ones in 3 different flavours and one so-called "rohový koláč". The baskets are then delivered by the bride to the wedding guests. For the purpose of this simulation, the process and the required ingredients are simplified.
The goal is: to optimize the number of helpers in the kitchen and to find the optimal amount of basic ingredients for a specified number of guests.
Method: Discrete simulation - SIMPROCESS
Entities:
- sweets
- paper baskets
- baking trays
Resources
- pastry-cooks
- bride
- flour
- sugar
- curd
- plum jam
- poppy seed filling
Process steps
- order for paper basket
- preparing sweets: small ones (3 different flavours), "rohove kolače" sweets (using all flavours)
- baking in the oven
- sugar coating
- packaging
- delivery
Data:
- https://www.svetsvateb.cz/2021/02/623262-svatebni-kolacky/
- https://megvkuchyni.cz/recepty/speciality/svatebni-special-jak-na-svatebni-kolacky/
- experience
Author: Michaela Červinková (cerm18) (talk) 10:16, 8 December 2021 (CET)
Carsharing company fleet optimization
Problem definition
Recently, carsharing has become more and more popular in large cities. Short-term rental (from several minutes to 24 hours) of a car, with the possibility to drop it anywhere in the allowed area of the city, attracts people who for some reason do not want to use their own vehicles. However, it is not always convenient. If the fleet is relatively small, the probability that a car will be close by is also quite low. Cars must also sometimes be refueled or recharged by external staff, so the cost of fleet maintenance increases with the fleet size.
Simulation
The proposed agent-based simulation will reproduce the real situation with shared cars. Two types of agents are planned:
- cars with different states (waiting, in rent, maintenance) and characteristics (mileage, fuel level)
- drivers - users of the service, who rent the cars and have their own behavior, including decision making on taking a car, driving style, and so on.
Some data for the model will be obtained from personal observations of two carsharing services operating in Prague, Anytime and Uniqway (for example, the number of available cars, which is visible in the mobile applications). Another source of data would be statistics collected by services abroad, for example by Yandex.Drive, a service operating in Russia[1].
The goal is: to find the optimal fleet size and structure and the pricing policy that maximize the revenue of a carsharing company.
Method: agent-based simulation - NetLogo
Author: Sergei Shcherbinin (shcs00) (talk) 12:34, 8 December 2021 (CET)
- I like the idea. Please, do not hesitate to contact the companies directly for the data. They are sometimes willing to help if you require just general, non sensitive data. Approved. Tomáš (talk) 12:08, 10 December 2021 (CET)
How long would it take to find love
Problem definition
There is the Drake equation, which is used to estimate the chance of meeting an alien (https://en.wikipedia.org/wiki/Drake_equation). There is also an article (https://e-lub.net/annuals/why.htm) which, to sum up, uses the Drake equation to calculate the chance of meeting the love of your life in London.
So what I want to do is let the user enter some parameters describing what he/she thinks the perfect person is, and based on these parameters we will see how long it would take to meet a perfect person in a city (say, Prague).
Simulation
Data about the city would be taken from official sources (Eurostat or data gathered by the Czech Statistical Office). The user will enter all the parameters and then see when he/she meets such a person. Of course, for the sake of feasibility some things will be simplified. For example, I don't think I will actually add millions of agents; maybe hundreds would be enough.
The goal is: to see how year after year passes and love is still not found :) Or maybe to understand that lowering the expectations could lead to finding love faster.
Method: agent-based simulation - NetLogo or AnyLogic
Author - Vaso Dzhinchvelashvili
- To be honest, although it is a funny idea, the problem is in fact verification. You would deal with soft properties and attitudes, nothing that could be measured. Even the article apparently wasn't meant to be serious.
- Please, try finding something else. Tomáš (talk) 19:17, 17 December 2021 (CET)
UPD:
Sample size for a/b test
Problem definition
There are different approaches to calculating the sample size for an A/B test when measuring the difference in means between the treatment and control groups:
- Calculation through the power of a test
- Rule of thumb for the power of a test: https://stats.stackexchange.com/questions/11131/sample-size-formula-for-an-f-test
- Some other formula: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2876926/
- Formula derived from the confidence interval: https://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_confidence_intervals/bs704_confidence_intervals5.html
And I think there should be more. They give different results.
Another problem lies in type 1 and type 2 errors: a researcher may fail to detect a difference that exists, or may detect a difference where there is none. More importantly, some of these formulas do not use the type 2 error at all, whereas it should be included in the models somehow, maybe as one of the axioms. So the goal of the simulation is to illustrate the relation between type 1 and type 2 errors, sample size, difference in means, and the decision to reject or not reject the null hypothesis. It will be a tool that answers the researcher's question: 'how many observations do you need to detect the difference in a test?'
Simulation
Simulate an A/B test: set the distribution parameters for the control and treatment groups, for different sample sizes. Calculate statistics (alpha, beta, p-value) for the tests. Compare the number of observations to the ones derived from the formulas; this way the correct one could be picked.
Illustrate real chances of type 1 and type 2 errors. Illustrate how enhancing the difference in means leads (or doesn't) to decrease of needed sample size/ rise of significance.
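A minimal Monte Carlo sketch of this comparison, using only the Python standard library, could look as follows. It assumes normally distributed observations with a known, equal standard deviation in both groups and a two-sided z-test; the power-based formula shown is the common textbook one, and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.8):
    """Power-based formula: n = 2 * (z_{1-alpha/2} + z_{power})^2 * sigma^2 / delta^2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


def rejection_rate(n, delta, sigma, alpha=0.05, runs=2000, seed=0):
    """Share of simulated A/B tests in which the null hypothesis is rejected.

    With delta == 0 this estimates the type 1 error rate;
    with delta != 0 it estimates the power (1 - type 2 error rate).
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma * math.sqrt(2 / n)  # sigma assumed known
    rejections = 0
    for _ in range(runs):
        control = fmean(rng.gauss(0.0, sigma) for _ in range(n))
        treatment = fmean(rng.gauss(delta, sigma) for _ in range(n))
        if abs(treatment - control) / standard_error > z_crit:
            rejections += 1
    return rejections / runs


n = sample_size_per_group(delta=0.5, sigma=1.0)
print("n per group:", n)
print("simulated power:", rejection_rate(n, delta=0.5, sigma=1.0))
print("simulated type 1 error:", rejection_rate(n, delta=0.0, sigma=1.0))
```

If a formula is adequate, the simulated power at its recommended sample size should be close to the requested power, and the simulated type 1 error close to alpha; the same loop can be repeated for each candidate formula.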
The goal is: Pick the best method for calculation of sample size. Have a clear understanding how to design a/b test, based on some inputs (expectations)
Method: Python, Monte Carlo
Author - Vaso Dzhinchvelashvili
Scooter rental
Problem definition
Today, in almost every big city you can find a huge number of scooters that can be rented. The main feature of scooter rental is that you can pick a scooter up at one point in the city and return it at another when you finish your route. If there is no scooter nearby, a potential client will obviously not take one. In addition, people prefer to ride with friends, so the likelihood of renting a scooter is higher if there are several of them at one point. The more scooters there are in the city, the higher the chance that people will use the service; however, scooter support also requires additional resources and extra costs.
Simulation
The goal of the agent-based simulation is to determine the optimal number of scooters for a specific city and their distribution within the city so as to maximize the income of the company. Two types of agents will be used:
- scooters with a list of states (rented, broken, available, available + near other scooters) and characteristics such as, for example, charge level.
- clients with different characteristics (for example, alone, with friend, maximal suitable distance to a scooter)
Method:
Agent-based simulation in NetLogo
Author: Liudmila Kalashnikova (liudmila_kalashnikova)
Approved Tomáš (talk) 00:24, 17 December 2021 (CET)
Is it possible to stop fake news?
Fake news is a global problem and it is getting worse every year. Is it possible to stop fake news? Fake news is made by people called "trolls". These trolls create fake news and send it to people. On the other side of the table are elves, who defend people from fake news and send "fake news warnings". When people receive fake news, they have basically three choices: resend it as it is (basically become a troll), resend it with a "fake news warning" (become an elf), or not resend it. Can elves beat trolls and defend the majority of the system from fake news?
Possible parameters of simulation
- number of trolls at the start
- probability of becoming a troll (the complement to 1 is the probability of becoming an elf)
- troll's number of resends
- elf's number of resends
- troll's probability of resending
- elf's probability of resending
- and possibly many more which I can't think of right now.
Additional idea:
Elves can change into trolls (and vice versa) over time. It would depend on how much information they received and from whom. For example, if after some time a troll has received more warnings from elves than fake news from trolls, he becomes an elf too.
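The conversion rule from the additional idea can be sketched in Python as below. This is a simplified illustration, not the proposed NetLogo model: messages go to uniformly random receivers, everyone always resends, elves exist from the start, and a person's role follows whichever kind of message they have received more of. The function name and all default values are hypothetical.

```python
import random


def simulate(n_people=500, n_trolls=10, n_elves=10, resends=3, steps=30, seed=0):
    """Return the number of trolls, elves and neutral people after `steps` rounds."""
    rng = random.Random(seed)
    roles = ["troll"] * n_trolls + ["elf"] * n_elves
    roles += ["neutral"] * (n_people - len(roles))
    fake_seen = [0] * n_people      # fake news received so far
    warnings_seen = [0] * n_people  # fake news warnings received so far
    for _ in range(steps):
        # Iterate over a snapshot so role changes take effect in the next round.
        for sender, role in enumerate(list(roles)):
            if role == "neutral":
                continue
            for _ in range(resends):
                receiver = rng.randrange(n_people)
                if receiver == sender:
                    continue
                if role == "troll":
                    fake_seen[receiver] += 1
                else:
                    warnings_seen[receiver] += 1
        # Conversion rule: the majority of received messages decides the role.
        for person in range(n_people):
            if fake_seen[person] > warnings_seen[person]:
                roles[person] = "troll"
            elif warnings_seen[person] > fake_seen[person]:
                roles[person] = "elf"
    return {r: roles.count(r) for r in ("troll", "elf", "neutral")}


print(simulate())
print(simulate(n_trolls=20, n_elves=5))
```

Varying `n_trolls`, `n_elves` and `resends` in such a loop is one way to look for the boundaries asked about in the goals; the resend probabilities from the parameter list are left out here for brevity.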
Articles about fake news spread
- https://www.nytimes.com/2021/11/22/world/europe/belarus-migrants-facebook-fake-news.html
- https://semantic-visions.com/resource/defending-the-covid-19-vaccination-pipeline
- https://www.cits.ucsb.edu/fake-news/spread
- https://thenextweb.com/news/ai-isnt-going-to-stop-fake-news-syndication
- https://www.bbc.co.uk/bitesize/articles/z6kxxyc
- https://ischool.syr.edu/fake-news-why-people-believe-how-it-spreads-and-what-you-can-do-about-it/
- https://futurism.com/fake-news-study-spread
Goals of simulation are answers to these questions:
- How many trolls are needed to reach the majority of people, and how many resends should they perform (on average)?
- How many elves are needed if there are X trolls? And how many resends should the elves perform?
- Where are the boundaries to these numbers?
TOOL used for the simulation
- NetLogo
- Agent based simulation
Author: Tomáš Martínek (mart13) (talk) 00:42, 13 December 2021 (CET)
Approved Tomáš (talk) 00:22, 17 December 2021 (CET)
Simulation 3D print
The simulation would focus on printing a 3D model on a 3D printer. The model would be made of multiple materials, so the 3D printer would need material storage and a way to switch between materials. Each material needs a different temperature and time to print its parts of the model, and each material has a different price. At the same time, additional cooling may be needed to prevent the print from warping, which slows the print down. The influence of the environment also needs to be considered, such as ambient temperature, wind or nozzle clogging.
Measurement parameters
- time of making
- production cost
- nozzle temperature
- time between transitions
- condition of materials
- etc
Goals
- Measure how long it takes for the model to print.
- Determine its cost, taking into account the price of the material and the printing time.
- Determine whether additional cooling needs to be switched on.
Tool
- SIMPROCESS or Vensim - advice?
Author: Ondřej Pišl
- This is a good example of a calculation, but not of a simulation. In the case of System Dynamics, it should focus on one parameter and its dynamics, for example material consumption, and on all the parameters that influence it, including the complexity of the printed model. Probability (random variables) should also come in: print failure, model collapse (and its relation to model complexity). The goal could then be a simulation that tells the user the most probable material consumption for the input given by the user. Modify the proposal in this way, and then Vensim could be a viable choice. Oleg.Svatos (talk) 22:05, 19 December 2021 (CET)
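To illustrate the kind of random-variable model the comment above points to, here is a minimal Monte Carlo sketch in Python. It assumes each print attempt fails independently with a fixed probability and that a failed attempt wastes a uniformly random fraction of the model's material; it estimates the mean consumption rather than the most probable value. The function names and all numbers are hypothetical.

```python
import random
from statistics import fmean


def material_used(model_grams, p_failure, rng):
    """Material consumed until the first successful print (p_failure < 1)."""
    used = 0.0
    while rng.random() < p_failure:
        # Assumption: the print fails at a uniformly random point of its progress.
        used += model_grams * rng.random()
    return used + model_grams  # the final, successful attempt


def estimate_consumption(model_grams=100.0, p_failure=0.2, runs=10000, seed=0):
    """Average material consumption over many simulated print jobs."""
    rng = random.Random(seed)
    return fmean(material_used(model_grams, p_failure, rng) for _ in range(runs))


print(estimate_consumption())
print(estimate_consumption(p_failure=0.4))
```

Making `p_failure` a function of model complexity would connect the sketch to the relation between failure and complexity mentioned in the comment.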