Project: 20th Anniversary Pentium G3258 – Part 2: Overclocking & Benching

As an engineer, I get excited whenever I have any form of optimization to perform. Overclocking is a kind of optimization, depending on how you do it – in this case it’s attempting to extract the maximum clock rate, with the lowest core voltage (and heat) practicable, maintaining a level of stability which is high enough for everyday use, using parts of the lowest cost, within the constraints of the stock CPU cooler, paste and weak VRMs on the motherboard. Having previous experience with Haswell overclocking, I dove right in and tried my hand at several things.

Initial Experiences

From my initial experiences with the ASRock Z97M Anniversary, overclocking was slightly more difficult than I would have liked. Due to some conflicting settings somewhere (I don’t know which), despite selecting the overclock defaults provided, the board booted into Windows at stock speed and could only be overclocked with ASRock’s A-Tuning software. Resetting the CMOS using the UEFI Setup Utility didn’t help. The strange thing was that the UEFI utility insisted the CPU was overclocked, and everything else insisted that it wasn’t, including A-Tuning.

It was only after clearing the CMOS with the hardware jumper that I managed to get the board to boot with the overclocked settings.

I started with the pre-sets, but soon opted to “go it my own”. Important parameters to tweak are:

  • CPU Multiplier (selects the Core speed)
  • Cache Multiplier (selects the Uncore speed)
  • GPU Clock Rate (overclocks the integrated GPU)
  • CPU Core Voltage (selects the voltage the FIVR gives the cores)
  • CPU Cache Voltage (selects the voltage the FIVR gives the uncore)
  • Current Limit (allows for higher power usage)
  • Turbo Power Limit (allows for higher power usage)
  • FIVR Operating Mode (select high performance over efficiency)
  • FIVR Faults (disable to prevent issues)
  • GPU Voltage (selects the voltage the FIVR gives the GPU)
  • RAM Clock Base (selects 100/133MHz based RAM increments)
  • RAM Rate (selects the multiplier)
  • RAM Latencies (to fine tune the RAM)
  • Analog I/O voltage (to add stability for RAM overclocking)
  • Digital I/O voltage (to add stability for RAM overclocking)
  • System Agent voltage (to add stability for RAM overclocking)
  • DIMM Voltage (changes the voltage to the RAM)

Ultimately, given the price of the CPU, I didn’t mind killing it at all, so I went through it in a rather cut-throat way.

Overclocking Methodology

The methodology behind overclocking Haswell CPUs can be simplified to five stages. By following them in order, it is possible to reduce the number of headaches and “I don’t know what caused it to BSOD” problems, while also reducing the time it takes to reach full stability.

0. Set the BIOS settings

The first thing to do is to set the BIOS settings correctly. I would suggest you try to load a low overclock default as a basis if you’re not comfortable setting them from scratch. You will need to disable current limits, faults and identify where your multipliers are. You need to check and verify (say, with CPU-Z) that your board is doing what it says it’s doing (setting multipliers correctly). Also consider setting the fans to full speed to avoid losing cooling performance. Also, you might want to disable thermal throttling (at your own risk) to ensure the CPU performs at the fastest speed at all times regardless of the high temperature.

1. Optimize Core

The first overclocking stage is to optimize the core independently of everything else. The fastest way to do this is to increase the core voltage to the highest you can accept (likely to be 1.400v or less for air cooling) within thermal limits (check temperatures using HWMonitor or CoreTemp), and then increase the multiplier until the system no longer boots. Then dial back the multiplier, and try to pass 24 hours of the Prime95 mixed torture test. Once you can pass 24 consecutive hours, you know you’re stable at that multiplier.

Remember that the temperatures reported are junction temperatures, and the temperatures reported on the datasheets are case temperatures. It would be wise to stay under 95 degrees C at the peak, and others will advise you to be even more conservative. But if it doesn’t crash, you’re pretty much good as real life workloads are unlikely to cause as much stress as Prime95 will.

Once you have this, it would be advisable to start the voltage optimization routine, by repeating the above but instead of increasing the multiplier, your job is to reduce the voltage in small steps (0.025v at the most, preferably 0.010v) until it breaks, and then go back up one or two steps. This will reduce the heat produced and improve the electrical efficiency of your system.
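As a rough illustration, the two-phase routine above can be written as a small search loop. This is only a sketch: `is_stable` is a hypothetical stand-in for "set these values in the UEFI, boot, and run your stress test", and `toy_chip` is a made-up model that exists purely so the example runs; its numbers do not describe any real CPU. Voltages are in millivolts to keep the arithmetic exact, and the starting multiplier is assumed to be stable at the maximum voltage.

```python
from typing import Callable, Tuple


def tune_core(is_stable: Callable[[int, int], bool],
              start_multiplier: int = 32,
              max_multiplier: int = 60,
              max_voltage_mv: int = 1400,
              step_mv: int = 10,
              safety_steps: int = 1) -> Tuple[int, int]:
    """Return (multiplier, core voltage in mV) found by the two-phase routine."""
    # Phase 1: hold the highest voltage you accept and raise the multiplier
    # until the next step up fails.
    multiplier = start_multiplier
    while multiplier < max_multiplier and is_stable(multiplier + 1, max_voltage_mv):
        multiplier += 1

    # Phase 2: hold that multiplier and lower the voltage one step at a time
    # until the next step down fails.
    voltage_mv = max_voltage_mv
    while voltage_mv - step_mv > 0 and is_stable(multiplier, voltage_mv - step_mv):
        voltage_mv -= step_mv

    # Go back up by a small margin, never above the accepted maximum.
    voltage_mv = min(max_voltage_mv, voltage_mv + safety_steps * step_mv)
    return multiplier, voltage_mv


def toy_chip(multiplier: int, voltage_mv: int) -> bool:
    """Made-up stability model for demonstration only (not real silicon)."""
    return voltage_mv >= 900 + 30 * (multiplier - 32)


print(tune_core(toy_chip))  # (48, 1390) for this toy model
```

In real use every call to `is_stable` costs a reboot and a long stress run, so the number of calls matters far more than the code itself.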

2. Optimize Uncore

The next step is to optimize the uncore – this is the multiplier which controls the cache. While the chip is shipped normally running both core and uncore at synchronous rates (i.e. 32 and 32), for high overclocks, it is likely that this isn’t possible. Uncore seems to top out at 39-42 depending on the chip.

While it isn’t an absolute necessity to run the cache synchronously with the CPU, there are slight performance benefits to overclocking the uncore due to the increased cache bandwidth. As a result, you should increase your uncore multiplier, starting from 38-39, see where it fails the 24-hour Prime95 test, and then back it off.

The voltage to the uncore is generally less important as it contributes less heat; I would suggest setting it to 1.2 or 1.25v and leaving it.

3. Optimize RAM

The third step is to optimize the RAM. It’s helpful if you have the stock latencies from the SPD at hand (see CPU-Z). Generally, the Intel CPUs benefit greatly from increased RAM clock rate even at the expense of latency.

To prepare for RAM overclocking, I would suggest increasing the Digital I/O, Analog I/O and System Agent voltages by +0.2v and the Vdimm to 1.65v and leaving them there. Then, start pushing the clock rate while manually adjusting latencies, as not all boards will make sensible decisions. Do change the DRAM base clock to 133MHz if that helps you eke a little more out of your RAM. You should also consider changing from 2T/2N to 1T/1N to improve performance as well.

To work out the latency, I’ve made this handy table (especially handy for cheap RAM):

[Table image: Clock-Basis]

For example, let’s say we have cheap DDR3 1600MHz RAM with a latency of 11-11-11-28 at 1600MHz. You can see that the CAS latency (first number) corresponds to ~6.88ns. If you want to try running this at 1866MHz, you look up the number nearest to this and find a CAS latency of 13 at 1866MHz corresponds to 6.96ns which should be safe. Likewise if you then decide to push it to 2400, you should probably try CAS latency of 15 to keep a similar amount of “absolute” time.

How do you determine the big number at the end? Well, as a very rough rule of thumb, you should be using something around twice the CAS latency + 1 or 2 as a minimum. So if you’re trying 15-15-15-?, the ? should be 31 or 32 as a minimum.

This gives you a nearly safe starting point to decide what manual timings to apply.
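The table lookup can also be done with a few lines of arithmetic. The sketch below uses the same basis as the figures quoted above (cycles divided by the rated speed, so 11 at 1600 gives about 6.88). Because DDR memory transfers data twice per clock, the true time in nanoseconds is double these figures, but the comparison between speeds is unaffected. The function names are my own, and rounding up to the next whole cycle is an assumption chosen to err on the safe side.

```python
import math
from typing import Tuple


def cas_time(cas_cycles: int, rated_mhz: int) -> float:
    """CAS time on the same basis as the table: cycles / rated speed, times 1000."""
    return cas_cycles * 1000.0 / rated_mhz


def starting_cas(old_cas: int, old_mhz: int, new_mhz: int) -> int:
    """Smallest whole cycle count that does not shorten the absolute CAS time."""
    return math.ceil(old_cas * new_mhz / old_mhz)


def starting_tras(cas_cycles: int) -> Tuple[int, int]:
    """Rule of thumb from the text: twice the CAS latency plus 1 or 2."""
    return (2 * cas_cycles + 1, 2 * cas_cycles + 2)


print(f"{cas_time(11, 1600):.3f}")     # stock CAS 11 at 1600
print(starting_cas(11, 1600, 1866))    # 13
print(starting_tras(15))               # (31, 32)
```

By this conservative rule, 2400 would start at CAS 17. The CAS 15 suggested above is tighter than the stock absolute latency, so treat it as a value to test rather than a guaranteed starting point.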

Then you can tighten these and see if you can still keep it stable. Memtest-86 can be helpful, although this needs to be followed by a Prime95 run to be sure. Note that Memtest doesn’t show the right CPU clock rate – this is normal.

The important thing to realize is that overclockers term this loosening the timings, but you need to remember the timings are the number of clocks at a given clock rate. If you increase the clock rate, it takes more clocks for the same amount of time to pass. As a result, you might still get a net benefit if you increase your clock rate (bandwidth) even if you have to loosen your timings.

4. Optimize GPU

The final piece is to optimize the GPU. For this, I like to push the voltage up to about 1.300v and then move the clock up in 100MHz increments and run a 3D test application (even the Windows System Assessment Tool is enough in most cases). When the GPU is unstable, the graphics subsystem driver resets and the screen blinks. Back off by 100MHz once you hit this point and you’re often fine. The GPU can be tweaked in 50MHz increments if necessary.

It might seem complicated at first, and in some ways, it can be. But if you’re willing to live with a more limited overclock, then the existing “one touch” settings might be more to your taste.

Finalized Settings

As it turns out, here are my finalized settings, which pass 24 hours of Prime95 with the stock cooler, never exceeding 91 degrees C:

  • Core Multiplier: 46x (stock: 32x)
  • Cache Multiplier: 40x (stock: 32x)
  • Vcore: 1.375v (stock: 1.090v)
  • Vcache: 1.200v
  • Boost Power Max: 1000
  • Short Power Max: 1000
  • GPU Clock: 1700MHz (stock: 1100MHz)
  • GPU Voltage: 1.300v
  • System Agent: +0.2v (stock: 0v)
  • Analog I/O Voltage: +0.2v (stock: 0v)
  • Digital I/O Voltage: +0.2v (stock: 0v)
  • RAM Clock: 2666MHz (stock: 1600MHz)
  • RAM Timings: 14-14-14-30-1T (stock: 11-11-11-28-2T)
  • RAM Voltage: 1.65v (stock: 1.50v)
  • CPU Fan: 100% Fixed Mode

The proof is in the screenshot:

[Screenshot: 46-40-1375-1200]

Looking at the stock VID of 1.090v, the chip is considered an “okay” to “good” chip. It’s not an excellent gem, and that’s noted by the high 1.375v core voltage required to achieve 4.6GHz. AnandTech’s sample managed to do 4.7GHz at 1.375v – so mine’s not far off the beat.

The biggest surprise was the level to which the Samsung OEM RAM (which has no heatspreaders or any fancy features) managed to overclock while simultaneously pushing the timings somewhat (despite how it appears). This actually has a positive performance benefit – as I will demonstrate in the Benchmarking section.
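To see why the looser-looking numbers are actually tighter in absolute terms, compare each timing relative to the rated speed. This is a minimal sketch using the stock and final RAM settings listed above. The helper names are my own, the timing labels assume the usual CL-tRCD-tRP-tRAS order, and the bandwidth figure is the theoretical peak for a single 64-bit channel, not a measured result.

```python
def time_change(stock_cycles: int, stock_rate: int,
                oc_cycles: int, oc_rate: int) -> float:
    """Fractional change in absolute time for one timing (negative means shorter)."""
    return (oc_cycles / oc_rate) / (stock_cycles / stock_rate) - 1.0


def peak_bandwidth_gb_s(rate_mt_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak for one channel: transfers per second times bytes per transfer."""
    return rate_mt_s * bus_bytes / 1000.0


stock = {"rate": 1600, "timings": (11, 11, 11, 28)}
final = {"rate": 2666, "timings": (14, 14, 14, 30)}

for name, s, f in zip(("CL", "tRCD", "tRP", "tRAS"),
                      stock["timings"], final["timings"]):
    change = time_change(s, stock["rate"], f, final["rate"])
    print(f"{name}: {s} -> {f} cycles, absolute time {change:+.1%}")

for label, cfg in (("stock", stock), ("final", final)):
    print(f"{label}: {peak_bandwidth_gb_s(cfg['rate']):.1f} GB/s peak per channel")
```

On these numbers the CAS time is roughly a quarter shorter even though the cycle count went from 11 to 14, while the peak bandwidth rises by about two thirds.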

One unfortunate thing seems to be that this is the limit of the CPU and the motherboard as well. The VRMs are squealing fairly loudly under heavy computational load, normally indicative of slight overloading or stress. Nothing’s blown up or gotten too hot yet … but it means we are at the limit of the cooling, VRMs, CPU and RAM almost simultaneously – the kind of coincidence which helps you optimize every last drop from your hardware.
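For a rough sense of why everything runs out of headroom together, a common first-order approximation says dynamic power scales with frequency times voltage squared. The sketch below applies that rule of thumb to the stock and final core settings. It is only an estimate under stated assumptions: it ignores leakage as well as the cache, GPU and RAM changes, and it treats the stock VID as the stock operating voltage.

```python
def dynamic_power_ratio(stock_mult: int, stock_volts: float,
                        oc_mult: int, oc_volts: float) -> float:
    """Approximate ratio of dynamic power, assuming P is proportional to f * V^2."""
    return (oc_mult / stock_mult) * (oc_volts / stock_volts) ** 2


ratio = dynamic_power_ratio(32, 1.090, 46, 1.375)
print(f"Estimated core dynamic power: {ratio:.2f}x stock")
```

An estimate of a bit more than double the stock core power would go some way to explaining why the stock cooler and the VRMs are both near their limits.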

Validation Fun

Before all of that serious benching, I wanted to see just how fast this thing can go, irrespective of long-term stability. The way to achieve this is to push the voltage up to the maximum you (and your board) can tolerate in the short term, and then push the core rate up until you freeze and/or fail to submit your validation. For me, my CPU managed this impressive performance on both cores:

[Validation screenshot: ipnrce]

My first validation, and my first CPU at 5GHz. This is a pretty cool present, Intel!

Benchmarking

Raw clock numbers are rarely entirely indicative of performance gains in actual use cases. As a result, it is important to benchmark with a wide suite of applications which reflect your actual uses and may be subject to bottlenecks from various other components within the system. It also provides us an opportunity to investigate the performance difference between running DDR3 1600 at 11-11-11-28 and DDR3 2666 at 14-14-14-30 (fairly “lousy” latencies by 2666 standards).

Unfortunately, there are many, many benchmarks available and it would be impossible to cover them all. Others have some restrictive licensing agreements or software limitations that make them less useful. The benchmarks used below are:

The first results column represents running at stock speed with stock RAM, the next column is running at overclocked speed with RAM at 1600MHz 11-11-11-28. The column next to it represents the percentage improvement. This is repeated for running overclocked with the RAM at 2666MHz 14-14-14-30.

[Table image: BenchResults]

Overall, it seems that running the RAM with the looser timings but higher clock has a nearly 5% performance advantage on average. Some benchmarks showed no improvement, or even a marginal decrease, however. Overall, the CPU frequency increased by 43.75% and the benchmark results rose by 34.34-39.12%, which is a fairly significant amount, although slightly less than the core frequency increase would suggest.
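The gap between clock gain and benchmark gain can be put as a simple ratio. This is a small sketch using the averages quoted above; the function name and the term “scaling efficiency” are my own shorthand rather than a standard metric.

```python
def scaling_efficiency(benchmark_gain_pct: float,
                       stock_mult: int = 32, oc_mult: int = 46) -> float:
    """Fraction of the core clock increase that showed up in benchmark results."""
    clock_gain_pct = (oc_mult / stock_mult - 1.0) * 100.0  # 43.75 for 32x -> 46x
    return benchmark_gain_pct / clock_gain_pct


for label, gain in (("RAM at 1600 11-11-11-28", 34.34),
                    ("RAM at 2666 14-14-14-30", 39.12)):
    print(f"{label}: {scaling_efficiency(gain):.1%} of the clock gain realized")
```

On these averages, a little under four fifths of the clock increase shows up with the RAM at 1600, and closer to nine tenths with the RAM at 2666.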

Conclusion

It seems that my overclocking has yielded fairly significant benefits, although not without some effort. The CPU managed to get to 4.6GHz on both cores, stably, under stock cooling and push a 1600MHz DDR3 pair up to 2666MHz. Quite impressive indeed for the price.

In terms of performance, the Pentium has been elevated into a mid-high Core i3 level, sitting between the 4350 and 4360 in terms of Passmark CPUBenchmark scores. It’s a decent chunk of performance for a basic machine, and a decent price. Add to that, a weekend of fun for me, and that’s money well spent!

Again, I will state, if you have no intention of overclocking, the CPU is a pretty “regularly” priced low-end CPU with no real distinguishing features. But if you do, you can definitely get some improvement for free – just don’t expect it. It’s not for everyone.

About lui_gough

I'm a bit of a nut for electronics, computing, photography, radio, satellite and other technical hobbies.

6 Responses to Project: 20th Anniversary Pentium G3258 – Part 2: Overclocking & Benching

  1. rasz_pl says:

    Hi.
    Have you seen this http://forums.anandtech.com/showthread.php?t=2389948 ?
    I got myself asrock H81M-DGS (~$50) and it runs 4.3GHz at 1.2v on stock.
    320 in Cinebench R15 with some crappy ram.

    I would never pay more for a motherboard than for the CPU – what’s the point? You could very well have bought a cheap H81 board ($40) with an i3-4160 ($130) and not overclocked at all.

    • lui_gough says:

      No, I haven’t, although I was aware that some motherboards (non-Z) did have overclocking enabled by Beta BIOS (and subsequently, angered some other motherboard makers).

      Overall, I see no issue with paying more for a motherboard than the CPU – I did it all the time back in the AMD days when Sempron 2600+ chips were $28 and motherboards were $45, although if there are cheaper boards fitting your needs, it’s well worth buying it. Unfortunately, most of the stores I frequent have very few H-series boards, and they’re priced pretty close to their Z-series counterparts. Older boards (i.e. Z87 and before) are rarely available in the stores I frequent, and still cost similar to the Z97 boards.

      (EDIT: If you can find the cheaper older boards and pair it with a regular i3, that’s great too. Just don’t expect that to happen in Aussie computer shops which stock next to nothing to avoid having old inventory to flog off for cheap. Oh, and not to mention, I do it for the fun and the commemoration. As I said, it’s not for everyone.)

      – Gough

    • sparcie says:

      Personally I think it is well worth having a better mainboard whether you are overclocking or not.

      Choice of mainboard is probably the biggest factor in quality of a system as it will dictate what parts you can install over the life span of the machine. It will even change the type of processor you can install as many boards do not support all chips that fit in their socket. Some will require BIOS upgrades before you can install that high performance CPU.

      You often get better features for a start like extra/better SATA ports and sometimes better chipsets. This can make quite the difference once you start adding PCI-E devices that depend on the mainboard. Not a problem if you plan on using everything on-board. It all depends on the features you need.

      The real/best benefit would have to be better quality components in the voltage regulation and some other lesser scrutinized parts. In particular better capacitors offered on some brands such as the “solid” caps that can be found on better gigabyte boards. It gives you more headroom when increasing voltages and will generally increase the working life of the machine, something even non-overclockers can appreciate.

      Cheers
      Sparcie

  2. shadowvlican says:

    that was quite informative, thank you!

  3. bent540 says:

    Brilliant post! I will feed my brain with this.

  4. Željko says:

    Thank you for info !
