Xsede13-prog-contest

From Earlham Cluster Department

Revision as of 18:39, 18 July 2013 by Amweeden06 (Talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Contents

Problem Descriptions

1. Area under the curve

Scientists need to calculate the area under a curve in many real-world applications, from determining the acceleration of an object in physics to modeling the change in concentration of a drug in pharmacokinetics. Calculus-based approaches work for calculating the area under simple curves, but for more complex curves, the best method is often to apply a discrete approximation using simple shapes and limits.

2. Prime number generation

Because prime numbers have a random (as far as we know) distribution across the number line, generating them can be a very useful task for introducing randomness to a scientific model. A famous scientific application of prime numbers is in cryptography, particularly in the RSA encryption algorithm, which uses pairs of prime numbers as keys to an encoded message. Generating prime numbers that are large enough to be used as keys is a computationally-intensive task.

3. Cellular automata

A number of interesting scientific models use cellular automata to represent real-world situations. A cellular automaton is a grid of cells, each of which has a state that is often determined by the states of its neighbor cells. A famous example of a cellular automaton is Conway’s Game of Life, but there are many examples elsewhere in science and math.

4. N-body simulation

In physics, the 2-body problem explores how two physical bodies with certain masses interact with each other through gravitational force. The 2-body problem has been solved, but extending the problem to a 3-body problem by adding just 1 more object makes the problem so chaotic that it has only been solved for special cases. In general, if scientists want to study the n-body problem, in which an arbitrary number of bodies interact with each other, they will often use computer models and simulations.

5. Binary tree traversal

In computer science, trees are a common data structure for storing and processing data. A tree is composed of nodes, each of which point to one or more child nodes. There is one node which rules them all, the root node. This is the first node in the tree, and the only node to which other nodes do not point. In addition to pointing to other nodes, each node will have a value (the meaning of which is dependent on the application). Finally, a tree has no loops (or cycles); that is, processing a tree from top to bottom will visit each node either 0 or 1 times.

A tree can be traversed using several different methods. The two most applicable here are in-order depth-first (visit all left children, then the current node, then all right children); and breadth-first (search each level left-to-right before moving to the next level down).

A binary search tree is a tree with special properties. First, each node will point to only 0, 1, or 2 other nodes. Second, the tree is structured such that the value of all of each node’s left children have values (whatever that may be) less than the node itself; and the value of all of each node’s right children have values greater than the node itself. Finally, each value in the tree must be unique.

6. Agent-based simulation

An agent-based simulation involves many individual things (“agents”) moving around in an environment, and is useful in many areas of science including biology (spread of disease), chemistry (motion of particles in an ideal gas), and physics (n-body gravitation).

7. Monte Carlo

Monte Carlo simulations are useful where a model does not have a deterministic result. Instead, the model is run many times (this is sometimes called an ensemble) and the results in aggregate are used to produce a probability distribution of expected outcomes.

8. Logistics

The problem comes from elementary math education, particularly fraction addition.

9. Simulated Annealing

One of the most important roles mathematics plays in the world is that of reducing complex problems to simple algorithms that either someone of a computer could follow to a desired solution. In many, many real world applications, the object is to find the least expensive, or the most environmentally safe, or the fastest solution from a huge number of possible solutions. There are certain optimization problems that become unmanageable using combinatorial methods as number of objects becomes large. For example, finding an optimal solution to the ‘Traveling Salesman Problem (TSP)’ optimization problem could be an incredibly difficult task, often practically impossible. Because when you increase the number of cities, you need to search through an enormous number of possible solutions in order to find the optimal one. In this case because we can't realistically expect to find the optimal one within a sensible length of time, we have to settle for something that is close enough. Simulated Annealing algorithm is used for finding the most optimal solution to a probabilistic and combinatorial problems. The simulated annealing algorithm was originally inspired from the process of annealing in metals. Annealing involves heating and cooling a material to alter its physical properties due to the changes in its internal structure. As the metal cools its new structure becomes fixed, consequently causing the metal to retain its newly obtained properties. In simulated annealing we keep a temperature variable to simulate this heating process. We initially set it high and then allow it to slowly 'cool' as the algorithm runs. While this temperature variable is high the algorithm will be allowed, with more frequency, to accept solutions that are worse than our current solution. This gives the algorithm the ability to jump out of any local optimums it finds itself in early on in execution. As the temperature is reduced so is the chance of accepting worse solutions, therefore allowing the algorithm to gradually focus in on an area of the search space in which hopefully, a close to optimum solution can be found. This gradual 'cooling' process is what makes the simulated annealing algorithm remarkably effective at finding a close to optimum solution when dealing with large problems which contain numerous local optimums.

10. Computational Linguistics

The NPR Sunday Puzzles are a rich collection of problems with anagrams and combinations of letters. Tools for genome sequencing are designed to work with the letters A, C, G, and T, but can be used for any letters. PERL is specially designed for these tasks, but any language will suffice. The question will be an expansion of one of the Sunday Puzzles.

Personal tools
Namespaces
Variants
Actions
websites
wiki
this semester
Toolbox