Currently there may be errors shown on top of a page, because of a missing Wiki update (PHP version and extension DPL3).
Navigation
Topics Help • Register • News • History • How to • Sequences statistics • Template prototypes

Difference between revisions of "Introductory stuff"

From Prime-Wiki
Jump to: navigation, search
m
m
Line 3: Line 3:
  
 
==Does GIMPS have a bandwidth problem?==
 
==Does GIMPS have a bandwidth problem?==
A key issue for any [[Distributed computing project]] is the actual distribution of data. That is, how much time is spent uploading answers or downloading new [[DC Work Units|work units]]. GIMPS, by its nature, can communicate weeks of work with only perhaps a dozen bytes of information. In fact, it is possible to "check out" exponents by e-mail. Bandwidth is not likely to be a limiting factor for GIMPS.
+
A key issue for any [[Distributed computing project]] is the actual distribution of data. That is, how much time is spent uploading answers or downloading new [[DC work unit|work units]]. GIMPS, by its nature, can communicate weeks of work with only perhaps a dozen bytes of information. In fact, it is possible to "check out" exponents by e-mail. Bandwidth is not likely to be a limiting factor for GIMPS.
  
 
==Has anyone cheated and done any hacking of the GIMPS?==
 
==Has anyone cheated and done any hacking of the GIMPS?==
Line 26: Line 26:
  
 
==What is "caching" or "queuing"?==
 
==What is "caching" or "queuing"?==
Amongst diehard DC fanatics, '''caching''' means some amount of added [[DC Work Units|work units]] are downloaded to your computer, and will be started when the current work finishes. Prime95 calls this by another traditional name "queuing". This also means that work will continue even if [[PrimeNet]] is unavailable, even unavailable for many days. One of the settings is "how many days of work to queue up." Since checking out an exponent reserves it for a minimum of sixty days, this value can be set fairly high. The default is twenty. Experience will reveal if you should raise or lower this value (which will vary based on the kinds of tests you choose to run). [[Lucas-Lehmer test]]ing doesn't require more than one or two extra candidates queued up. Trial Factoring, since it may finish quicker at times, probably would do well with four or five.
+
Amongst diehard DC fanatics, '''caching''' means some amount of added [[DC work unit|work units]] are downloaded to your computer, and will be started when the current work finishes. Prime95 calls this by another traditional name "queuing". This also means that work will continue even if [[PrimeNet]] is unavailable, even unavailable for many days. One of the settings is "how many days of work to queue up." Since checking out an exponent reserves it for a minimum of sixty days, this value can be set fairly high. The default is twenty. Experience will reveal if you should raise or lower this value (which will vary based on the kinds of tests you choose to run). [[Lucas-Lehmer test]]ing doesn't require more than one or two extra candidates queued up. Trial Factoring, since it may finish quicker at times, probably would do well with four or five.
  
 
==What is a "WU" or "work unit"?==
 
==What is a "WU" or "work unit"?==
Line 43: Line 43:
  
 
==What is the lifetime of a work unit and why does it matter?==
 
==What is the lifetime of a work unit and why does it matter?==
At present, the default lifetime of a [[DC Work Units|work unit]] is sixty days. At this time, the work unit (whether [[Trial factoring]] or [[Lucas-Lehmer test]]) is subject to being "unreserved" and handed out to someone else. You can, however, extend this deadline if you wish. The reason this matters is, if you are late, you don't really want someone else to start work if you are going to finish in sixty five days after all. Since reporting progress is, generally speaking, optional, it is quite possible for a candidate to be checked out for Lucas-Lehmer and see no update of its status until the work is done. Thus, keeping your work status up to date is helpful to everyone. This is why the [[Prime95]] code tries to report status once in a while.
+
At present, the default lifetime of a [[DC work unit|work unit]] is sixty days. At this time, the work unit (whether [[Trial factoring]] or [[Lucas-Lehmer test]]) is subject to being "unreserved" and handed out to someone else. You can, however, extend this deadline if you wish. The reason this matters is, if you are late, you don't really want someone else to start work if you are going to finish in sixty five days after all. Since reporting progress is, generally speaking, optional, it is quite possible for a candidate to be checked out for Lucas-Lehmer and see no update of its status until the work is done. Thus, keeping your work status up to date is helpful to everyone. This is why the [[Prime95]] code tries to report status once in a while.
  
 
Also, once work units (especially Lucas-Lehmer candidates) are returned, many users like to grab them. By the ordinary nature of mathematics and the project, Mersenne candidates get larger and larger. Older exponents that are "not done" are obviously smaller. Since many participants are motivated not by raw stats, but by actually obtaining an actual [[Mersenne prime]], the smaller the exponent, the better. Thus, broadly speaking, you'll never have a better chance of finding an actual prime than whatever it is you have queued up already.
 
Also, once work units (especially Lucas-Lehmer candidates) are returned, many users like to grab them. By the ordinary nature of mathematics and the project, Mersenne candidates get larger and larger. Older exponents that are "not done" are obviously smaller. Since many participants are motivated not by raw stats, but by actually obtaining an actual [[Mersenne prime]], the smaller the exponent, the better. Thus, broadly speaking, you'll never have a better chance of finding an actual prime than whatever it is you have queued up already.
Line 55: Line 55:
 
#Trial factoring is a relatively short computation whose goal is to eliminate potential Mersenne candidates in a couple of days instead of a couple of weeks. Many Mersenne candidates are multiples of relatively small primes. Trial factoring discovers if there is such a number. If it is, the more expensive Lucas-Lehmer test isn't needed. At this writing, computers between 300 MHz and 1 GHz are probably the best to deploy in Trial factoring.
 
#Trial factoring is a relatively short computation whose goal is to eliminate potential Mersenne candidates in a couple of days instead of a couple of weeks. Many Mersenne candidates are multiples of relatively small primes. Trial factoring discovers if there is such a number. If it is, the more expensive Lucas-Lehmer test isn't needed. At this writing, computers between 300 MHz and 1 GHz are probably the best to deploy in Trial factoring.
 
#The Lucas-Lehmer test definitively establishes whether a given Mersenne candidate is a prime or not. At this writing, a candidate typically runs for weeks on a fairly fast PC (say, of about a GHz or faster, especially Pentium IVs, since the assembler code exploits SSE2 instructions). Most candidates fail, but if you want the glory of being a "finder" you have to run this test.
 
#The Lucas-Lehmer test definitively establishes whether a given Mersenne candidate is a prime or not. At this writing, a candidate typically runs for weeks on a fairly fast PC (say, of about a GHz or faster, especially Pentium IVs, since the assembler code exploits SSE2 instructions). Most candidates fail, but if you want the glory of being a "finder" you have to run this test.
#10 000 000 million digit Lucas-Lehmer primality tests are basically the same as standard primality test, but test the primality of numbers with more than 10 000 000 digits. These work-units take a very long time to complete. For a PIV at 2.4GHz you should expect more than 40 days before it completes. The [[Electronic frontier foundation]] is offering a [[EFF prizes|$100 000 awarded]] to the first person or group to discover a ten million digit prime number. If you find such a prime with the software provided, GIMPS will claim the award and distribute the award according to the rules published on the GIMPS.
+
#10 000 000 million digit Lucas-Lehmer primality tests are basically the same as standard primality test, but test the primality of numbers with more than 10 000 000 digits. These work-units take a very long time to complete. For a PIV at 2.4GHz you should expect more than 40 days before it completes. The [[Electronic Frontier Foundation]] is offering a [[EFF prizes|$100 000 awarded]] to the first person or group to discover a ten million digit prime number. If you find such a prime with the software provided, GIMPS will claim the award and distribute the award according to the rules published on the GIMPS.
 
#Double checks. A double check is overwhelmingly likely to be a rejected candidate. However, the computations involved are so extensive, the GIMPS project runs even rejected candidates a second time just to be sure. Moreover, until the double check agrees with the original Lucas-Lehmer test, credit in the [[GIMPS statistics]] for the former is provisional. Any machine can be deployed on this test, though old, slow machines can take months to complete their work.
 
#Double checks. A double check is overwhelmingly likely to be a rejected candidate. However, the computations involved are so extensive, the GIMPS project runs even rejected candidates a second time just to be sure. Moreover, until the double check agrees with the original Lucas-Lehmer test, credit in the [[GIMPS statistics]] for the former is provisional. Any machine can be deployed on this test, though old, slow machines can take months to complete their work.
  

Revision as of 11:57, 12 February 2019

Are there any add-on programs like SETIQ?

GIMPS does not really require such programs. You can get status at any time and the program queues (caches) as much work as you like. This takes care of most of what add-ons do and it is in the basic program.

Does GIMPS have a bandwidth problem?

A key issue for any Distributed computing project is the actual distribution of data. That is, how much time is spent uploading answers or downloading new work units. GIMPS, by its nature, can communicate weeks of work with only perhaps a dozen bytes of information. In fact, it is possible to "check out" exponents by e-mail. Bandwidth is not likely to be a limiting factor for GIMPS.

Has anyone cheated and done any hacking of the GIMPS?

It is feasible, but unlikely. A positive claim, that of a new Mersenne prime, is subject to double and triple checks by others, including using non-Intel code that doesn't even use PrimeNet. Moreover, new Mersenne primes are rare and anyone can check new ones out at any time. Therefore, such a claim is very unlikely to survive scrutiny. That leaves fibbing about the other tests ("I didn´t find one") in order to boost one's statistical standings by doing less than a full crunch. Here, one needs a "co-conspirator" to also fib and on the same exponent. This is possible, but not very likely. GIMPS, as a project, has a fairly complex statistical scheme and will thereby be less likely to attract those who hunts "big statistics" without actually earning them. Moreover, it always is possible for anyone so minded to do a triple check, which means anyone with unusually high production can be caught out at any time. Triple checks have even taken place by accident, since the use of PrimeNet is not mandatory and those getting candidates by hand can try them out.

How can I get a new wu when I'm away from my computer?

One of the premier features of GIMPS is that it doesn't need to connect very often to the Internet, the default is actually every 28 days. If you take the recommended work type (match your computer's speed to the work suggested in the What kinds of Work Units are there? question), your computer will work several days to several weeks on a given problem. In addition, Prime95 can be set to deal with intermittent Internet connections (e.g. via dialup) to ensure new work is available if the current work finishes. As you gain experience, you can be very sure you have weeks of work waiting so that, in principle, you would only have to connect to the network very rarely (perhaps monthly or even less often).

In addition, you can even move work to machines that are not even connected to the Internet at all. This is called "sneakernetting". A simple procedure for performing it described in the readme.txt. You can also move work that has been started on one machine to a different machine, though that takes a little more work.

How to do nonet and sneakernet?

The answer is related to caching. The readme.txt file included in Prime95 describes much of how this is done. Broadly, one can move individual lines in the file worktodo.txt from one machine to another (whether it is Windows or Linux). This works if no work has been performed against that line in the file. A typical Lucas-Lehmer test line looks like this:

Test=13974239,65,1

while a typical trial factoring line looks like this:

Factor=20110747,63,64

The only really important thing to know is that if it is any line other than the first line, it would ordinarily not be undergoing process. To be sure, list the directory and look for files with names like this:

pD974239 qD974239

which shows the check pointing. Note that the last three digits (239) match the "Test" entry above, which means that both of them, as files, plus the "Test" line above would have to be moved (and only after the local copy of Prime95 was halted). Thus, you move individual checkpoint files, as such, along with moving and removing individual lines from the 'worktodo.txt' file.

Should I buy one or more computers to run GIMPS?

Buying machines just for distributed computing (a "farm" when it is more than one) is a very personal decision. Some people get the 'bug' very badly and do buy their own machines (often "stripped down" in various ways so that they really are just for GIMPS). This is a volunteer project. The original intent was to simply use idle cycles on existing machines bought for another reason. They owe you nothing in particular except a "thank you". They did not ask you to spend money on them. If you buy a machine just for this project, you must be prepared to see arbitrary changes made to the client software or the project for reasons that are not today obvious. Perhaps they may change the scoring system to invite different work to be more prominent. New algorithms may arise that change the importance of your existing work in unexpected ways. None of this has happened and it is less likely for GIMPS than other projects. But, it has happened on other projects and one can never be 100 percent sure. You have been warned. That said, there's a lot to be learned about building systems on the cheap, running Linux, and overclocking standard Intel or AMD boxes that come from this.

What is "caching" or "queuing"?

Amongst diehard DC fanatics, caching means some amount of added work units are downloaded to your computer, and will be started when the current work finishes. Prime95 calls this by another traditional name "queuing". This also means that work will continue even if PrimeNet is unavailable, even unavailable for many days. One of the settings is "how many days of work to queue up." Since checking out an exponent reserves it for a minimum of sixty days, this value can be set fairly high. The default is twenty. Experience will reveal if you should raise or lower this value (which will vary based on the kinds of tests you choose to run). Lucas-Lehmer testing doesn't require more than one or two extra candidates queued up. Trial Factoring, since it may finish quicker at times, probably would do well with four or five.

What is a "WU" or "work unit"?

A distributed computing project needs a problem that can be chopped up into little bits and, well, distributed. The little bit each of us gets has become known as a "work unit". It is just the right size - reasonable to download, but containing enough "stuff" so that our program can spend a lot of time doing worthwhile calculations.

In GIMPS there is more than one "kind" of work unit and the statistics for these are kept separately.

What is a Mersenne prime?

In mathematics, a Mersenne prime is a prime number that is one less than a power of two. For example, 3 = 4 - 1 = 22 - 1 is a Mersenne prime; so is 7 = 8 - 1 = 23 - 1. See Mersenne prime.

What is an "outage"?

This is a real-world project. That means that while the PrimeNet site is surprisingly robust, it is sometimes unavailable for many hours at a time. Overall, it has been extremely available. Since the program can be set to "cache" work, even if it is unavailable for several days, this should not affect your ability to continue to crunch. The total bandwidth to communicate with PrimeNet is very small due to the nature of the project. Thus, you can crunch on even if PrimeNet was far less available than it actually is.

What is CLI? What does Command Line Interface mean?

The command line interface simply means that the GIMPS clients program is launched, DOS-style, from an ordinary command line. The Linux version always executes this way.

What is the lifetime of a work unit and why does it matter?

At present, the default lifetime of a work unit is sixty days. At this time, the work unit (whether Trial factoring or Lucas-Lehmer test) is subject to being "unreserved" and handed out to someone else. You can, however, extend this deadline if you wish. The reason this matters is, if you are late, you don't really want someone else to start work if you are going to finish in sixty five days after all. Since reporting progress is, generally speaking, optional, it is quite possible for a candidate to be checked out for Lucas-Lehmer and see no update of its status until the work is done. Thus, keeping your work status up to date is helpful to everyone. This is why the Prime95 code tries to report status once in a while.

Also, once work units (especially Lucas-Lehmer candidates) are returned, many users like to grab them. By the ordinary nature of mathematics and the project, Mersenne candidates get larger and larger. Older exponents that are "not done" are obviously smaller. Since many participants are motivated not by raw stats, but by actually obtaining an actual Mersenne prime, the smaller the exponent, the better. Thus, broadly speaking, you'll never have a better chance of finding an actual prime than whatever it is you have queued up already.

What kinds of Work Units are there?

There are four kinds of standard work units:

  • Trial factoring (often just called "Factoring")
  • Lucas-Lehmer primality tests
  • 10 000 000 million digit Lucas-Lehmer primality tests
  • Double checks (which is just a formalized re-run of Lucas-Lehmer, usually on a slower machine). How these checks work is well-covered on the regular GIMPS pages. We simply treat them as things we need to do here.
  1. Trial factoring is a relatively short computation whose goal is to eliminate potential Mersenne candidates in a couple of days instead of a couple of weeks. Many Mersenne candidates are multiples of relatively small primes. Trial factoring discovers if there is such a number. If it is, the more expensive Lucas-Lehmer test isn't needed. At this writing, computers between 300 MHz and 1 GHz are probably the best to deploy in Trial factoring.
  2. The Lucas-Lehmer test definitively establishes whether a given Mersenne candidate is a prime or not. At this writing, a candidate typically runs for weeks on a fairly fast PC (say, of about a GHz or faster, especially Pentium IVs, since the assembler code exploits SSE2 instructions). Most candidates fail, but if you want the glory of being a "finder" you have to run this test.
  3. 10 000 000 million digit Lucas-Lehmer primality tests are basically the same as standard primality test, but test the primality of numbers with more than 10 000 000 digits. These work-units take a very long time to complete. For a PIV at 2.4GHz you should expect more than 40 days before it completes. The Electronic Frontier Foundation is offering a $100 000 awarded to the first person or group to discover a ten million digit prime number. If you find such a prime with the software provided, GIMPS will claim the award and distribute the award according to the rules published on the GIMPS.
  4. Double checks. A double check is overwhelmingly likely to be a rejected candidate. However, the computations involved are so extensive, the GIMPS project runs even rejected candidates a second time just to be sure. Moreover, until the double check agrees with the original Lucas-Lehmer test, credit in the GIMPS statistics for the former is provisional. Any machine can be deployed on this test, though old, slow machines can take months to complete their work.

In addition to the above work units you can also ask George Woltman for some very large candidates to test. Some of these numbers will take up to a year to crunch on a a very fast CPU.

Who is behind www.mersenne.org?

GIMPS is one of the oldest distributed computing projects (probably "the" oldest -- the oldest known at any rate). There are many, many contributors (see Credits for a large but incomplete listing). However, the real "father" of this effort, at least as we run it, is George Woltman.

Who is this Mersenne anyway?

See page on Marin Mersenne.

Why doesn't the GIMPS client use multi-threading?

It does. Simply run two copies of the program. This is described in the readme.txt file. There is no productivity advantage whatsoever in having a multi-threaded version of Prime95 over running two copies of the program. Projects like GIMPS are so ideal in their CPU consumption, it is very easy to consume all of the available CPU without resorting to the added complexity of multi-threaded technique. This is explained in the regular GIMPS FAQ as well.

Will GIMPS succeed?

It already has, several times. See the page on GIMPS milestones and Mersenne primes.

See also

External links