Multicore in the Age of the Unthinkable

post time 18. June 2009 member admin

Recently, I had the opportunity to read and finish in one weekend, ‘The Age of the Unthinkable’, by Joshua Cooper Ramo.  My attempt at a quick summary:

In the past, world affairs were driven by nation-states.  The small number of players at this level of granularity, combined with less communication (in both frequency and volume), made it possible to strategize about, control, and influence this system.  Today, however, the pace of change is far higher and communication flows between orders of magnitude more people, so this sort of nation-state actor strategy is insufficient.  It leads to unpredictable results that are oftentimes the opposite of what was expected, e.g. actions to counter terrorism lead to an increase in terrorism.  Ramo posits that counteracting the negative forces in the world requires a strategy that responds like an immune system: a creative and multi-pronged approach.

As a computer scientist, I understood the concept of a world too complicated for outcomes to be predicted.  Most computer scientists are exposed to Conway’s Game of Life (http://en.wikipedia.org/wiki/Conway's_Game_of_Life) very early in our education.  Another, more esoteric example is trying to predict when a neural network being trained by backpropagation will all of a sudden reach equilibrium.  I recall in both cases looking at these types of programs and just thinking, ‘Wow.  Amazing.’ - simple actions by a large number of entities creating surprising results.
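To make the Game of Life point concrete, here is a minimal sketch in Python.  It is my own illustration and not taken from the book or the Wikipedia article: the `step` function and the `glider` starting pattern are names chosen for this example, and the board is assumed to be unbounded.  Each cell follows the same few local rules, yet the five-cell pattern ends up travelling diagonally across the grid.

```python
from collections import Counter


def step(live):
    """Advance one generation; `live` is a set of (x, y) coordinates of live cells."""
    # Count, for every cell adjacent to a live cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Standard rules: birth on exactly 3 neighbours, survival on 2 or 3.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# The classic five-cell "glider" (y grows downward).
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

cells = glider
for _ in range(4):
    cells = step(cells)

# After four generations the same shape reappears one cell down and to the right.
print(sorted(cells))
```

Nothing in `step` mentions movement; the drift of the glider only shows up when you run the rules, which is the kind of surprise I mean.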

 

Now turning this to my work in multicore – I gave a talk last year (http://www.zurich.ibm.com/csc/software2008/domeika.html) where I referred to a thought from my colleague Tim Mattson, who compared what the industry is doing in multicore with the Draeger’s grocery store experiment, the study in which shoppers who were offered more varieties to choose from ended up buying less.  The conjecture is that perhaps we are pursuing too many ‘solutions’ to the problems of multicore software development and as a result are confusing customers.

Putting this all together – I’d like to suggest that all of these potential solutions are needed and inevitable.  Over time, the successful techniques will emerge; customers will move toward and employ the best ones.  It would not be fun to be on the side of one of the losing technologies, so the question is how you can help your particular technology win.  This is where a multi-pronged strategic and tactical approach is needed.  Here are some thoughts on items that are typically treated as lower priorities compared to the obvious goodness of your particular tech:

  • Ease-of-use – There is a natural tradeoff between ease-of-use and performance.  The level of tradeoff is determined by the programmer’s preference and ability.  That said, bugs, poor documentation, and poor diagnostics are things that can make a technology harder to use than it really should be.
  • Easy to understand – If it’s difficult to explain why your particular technology should be used over another, you’ve got a problem.  It may be the case that only a handful of techies at a customer company understand the details; however, being able to distill the pros/cons of a technology is critical.  For example, in 30 seconds, why would you use OpenCL over OpenMP or Pthreads?
  • Education – Customers need outlets to gain knowledge on a particular technology.  Typically, the more venues available for this learning (onsite classes, books, webinars, blogs, and so on), the better.  Psychologically, querying the web and seeing numerous links available with this type of information can be reassuring.
  • Open Standards – Customers frankly like choice.  Open standards tend to foster more choice with regard to implementations.  Is your solution proprietary?  If so, should you consider standardization?

Best regards,

Max

As an aside, I’d like to recommend a book from a colleague of mine.  Clay Breshears has recently published The Art of Concurrency.  I’ve taught classes on multicore with Clay.  In fact, he was the author of much of the content from which we taught.  I’ve read some of the book and can say that it reads well, explains concepts clearly, and in the end makes understanding concurrency and multicore programming easier.  Congratulations, Clay, on a well-written book.

Category Uncategorized | 1 Comment »

Post-ESC Impressions

post time 13. April 2009 member admin

I had a good trip to Embedded Systems Conference, spending several hours on the expo floor, giving two talks in the general conference, and chatting with a bunch of colleagues.  Here’s a rundown on what I found noteworthy:

  • Keynote by T.K. Mattingly - I could listen to an astronaut from the golden age of NASA for hours on end, so I enjoyed this thoroughly.  One thing he said that resonated with me was something he had heard from a launchpad engineer (I’m paraphrasing): “this is not going to fail because of me.”  Takeaway: in big engineering projects, you may not understand everything that is going on, but you should know your role and do it well, simple as that.
  • Expo floor - The show seemed about 30% smaller this year.  An adjoining hall that was packed with vendors in years past was unused.  Floor traffic seemed light on Tuesday and much better on Wednesday.  NI’s booth is always amazing: visual acquisition, signal processing, and robot control using a trendy game for inspiration.
  • With regard to multicore, CriticalBlue’s Prism tool was a standout.  The ability to perform ‘what if’ modelling of the expected performance gain for different parallelization scenarios is very compelling.
  • My talks - I gave two: “Debug Tools, Technologies & Techniques in the Multi-core Era” and “Case Studies in Software Optimization of Multi-core SMP”.  Both were decently attended, about 20 people each; attendance was a big question mark going in, with the financial crisis and all.

Regarding my talks, here’s the gist of what they are about and what I’d consider to be the compelling portions of each:

  • Debug Tools … - Survey of software development tools & techniques for debug of multi-core.  What is compelling: I suspect several in the audience learned about a new tool or a new technique to which they had not been exposed previously.  There is no silver-bullet, one-tool solution to multicore debug.  Instead, you need to apply a number of techniques and tools to the problem.
  • Case Studies … - Review of the threading development process and its application to two real-world applications.  What is compelling: the attendee sees that each step in the process makes sense because it guides what is done in the optimization of the application.  For example, the initial performance analysis helps you learn the application and feeds into which portion you should focus on for optimization.  It may seem obvious, but seeing it on a real application rather than a toy program is quite nice.
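To illustrate why the initial performance analysis matters, here is a rough sketch in Python based on Amdahl’s law.  Neither the talk description above nor any particular tool is being quoted here; the function name `amdahl_speedup` and the profile numbers are made up for illustration.  The idea is simply that the fraction of runtime a region accounts for caps the gain you can get from parallelizing it.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on overall speedup when only part of the runtime is parallelized."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical profile: share of total runtime spent in each candidate region.
profile = {"decode": 0.60, "filter": 0.25, "logging": 0.05}

for region, fraction in profile.items():
    bound = amdahl_speedup(fraction, cores=4)
    print(f"{region:8s} {fraction:4.0%} of runtime -> at most {bound:.2f}x on 4 cores")
```

With numbers like these, threading the region that dominates the profile is the only choice that can move overall runtime much, which is why the analysis step comes first.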

Best regards, Max

Category Uncategorized | 0 Comments »

MPP: On Documenting Best Known Methods for Multicore

post time 12. March 2009 member admin

I don’t mind the travel restrictions imposed by the economy because I’ve been fortunate to not have to travel until now.  Next week, I’m taking a day trip to Santa Clara to attend the Multicore Association Board Meeting and to discuss status on the working group I co-chair, Multicore Programming Practices.

The group, composed of technical leaders from a variety of embedded software companies, has been iterating on an outline of the document for the last 4 months.  The outline, which weighs in at ~30 pages and 7 chapters, is structured after the phases of a typical software development project, e.g. analysis, design, debug, and performance tuning.  The team has now split up to tackle the writing of 3 of the chapters, those focused on 1) an overview of available technology, 2) analysis and high-level design, and 3) performance tuning.

The challenge for the group is to explain the material detailed in the outline sufficiently while staying within the targeted page budget.  Very early on, David and I wanted a document that was more than a whitepaper, but much less than a book, so ~100 pages felt about right.  What this means is that we’ll be trying to distill the need-to-know information into, for example, about 20 pages for the analysis and high-level design chapter.  The team will obviously reference backing material where necessary, but I suspect the highly technical engineers on this project will want to explain topics in minute detail and will be challenged to be brief.

I’m looking forward to reporting our progress at the board meeting on 3/16 and also seeing the results of the initial writing that will be completed this month.  

Best,

Max

Category Uncategorized | 1 Comment »

Embedded multicore interview followup

post time 20. February 2009 member admin

Had a nice interview and mention in an article by John Blyler.

With regard to the issue identified in the story and video (coordination of process, thread, and vector level parallelism), I think there is a positive and a negative aspect for embedded developers.  The negative aspect for embedded developers is that much of the work for addressing this issue is being done in the desktop & server space.  The balance between OpenMP and automatic vectorization mentioned in the article is available in a compiler for desktop and server.  OpenMP is not supported on traditional embedded OSes to my knowledge.

The positive aspect is that embedded developers typically have more control over the other applications that may be executing on the system.  For example, you may have one process that is taking full advantage of the number of cores on the system; however, if another application is doing the same, you will typically end up with non-optimal performance.  On a desktop system, a developer doesn’t typically know the other applications that a customer may choose to execute.

An embedded systems developer would have better knowledge of other apps on the system and would be in a better position to do something about it.

Just thought I’d share some further thoughts on the interview.

Regards,

Max

Category Uncategorized | 0 Comments »

Embedded Multicore Debug

post time 9. February 2009 member admin

Early February finds me working on talks and papers for Embedded Systems Conference Silicon Valley.  I really do enjoy attending the conference every year and they really do treat their speakers well.  I’ve had the opportunity to hear keynotes from Al Gore and Dean Kamen (Segway inventor).  This year’s keynote is from Ken Mattingly of Apollo 13 fame.  How cool is that!

Of course, being a speaker at the conference involves real work, putting together a talk that will be appreciated by embedded developers with different experiences, backgrounds, interests, etc.

Over the past two weeks, I’ve been working on the first of my two presentations, a talk on embedded multicore debug.  This topic is very broad and I’m not an expert in all of the areas so I’m learning a great deal as I put together this talk.

What I’ve found is that multicore debug is not a solved problem.  It is very difficult and the technology for helping is still early in its maturation.  A positive spin - there is a lot of opportunity for innovation in this area.  My talk will cover technology such as static analysis, simulators, thread verification tools, and hardware-assisted tracing.  It’s a beginner-level talk, so I can’t go too deep in each of these areas.

Anyone care to disagree with my statement - multicore debug is not a solved problem?

Best, Max

Category Uncategorized | 2 Comments »

5 months in multi-core - what has changed?

post time 27. October 2008 member admin

This week found me in Zurich, Switzerland, delivering a talk to researchers. The purpose of my talk and the other talks at this symposium was to share what different companies and researchers are doing to help “solve” the multi-core programming software challenge.

My content was similar to a talk I delivered in Japan earlier this year. The topic of this post is a reflection on what, if anything, has changed with regard to multi-core in 5 months.

Just for grins, here’s the abstract of the talk:

The state of the art for optimizing and programming for parallelism on multi-core processors is evolving with many programming models being offered as the possible “solution” that software developers should use. Some would argue that there are perhaps too many such solutions being considered and some consolidation should occur. This talk shows the multicore programming technologies both currently available and being evaluated in the Intel® C++ Compiler. We’ll look at some different parallelism methods, such as software transactional memory, OpenMP 3.0, array notations and offer insight into what is guiding development of each.

Probably the most interesting change for my talk in the last five months has been the announcement of the Intel Parallel Studio. The toolkit comprises four different tools: Intel Parallel Advisor, Intel Parallel Composer, Intel Parallel Inspector, and Intel Parallel Amplifier.

Of course, I work at Intel, so I know a few more details on these tools but cannot share them at this point. However, on the surface, I’m very excited by the Advisor tool, which aims to “Gain insight on where parallelism will benefit existing source code.” I believe this is a key area of the multi-core development cycle that has relatively little tool support today. In addition, the tool targets developers who cannot necessarily throw their current implementation away and redesign. This particular theme lines up with my motivation and work with David Stewart on the Multicore Programming Practices working group, which targets existing and legacy applications.

I believe other software vendors are developing or are soon to make available tools with similar capabilities. This is good news. The availability of this type of tool can only help programming for multi-core so I’m excited to test drive them as soon as they are available.

On a personal note: Zurich is a beautiful city. One cool portion of my trip was attending Sunday service at the Fraumünster Kirche (building with the large clock tower in the background). Amazing.

Category Uncategorized | 3 Comments »

Thoughts on the endstate of multicore software development

post time 19. September 2008 member admin

I recently read two interesting articles on multi-core programming from different angles. One is from a noted compiler writer who is a creator of development tools that enable multicore/multiprocessor programming:

http://www.hpcwire.com/features/Compilers_and_More_Parallel_Programming_Made_Easy.html

The second article is an interview with a noted game programmer, who would be a consumer of such development tools:

http://arstechnica.com/articles/paedia/gpu-sweeney-interview.ars/

Both of these are good reads. Both articles make arguments about the need for easier parallel programming. Dr. Wolfe discusses what is realistic from the point of view of a creator of tools. Mr. Sweeney discusses what customers need and essentially argues that the development tools need to do the lion’s share of the work.

I believe that for multicore processors to significantly impact the industry (impact means most customers derive benefit, which means most developers take advantage of parallelism in one form or another), the end state for parallel programming is that it will blend into the background and in a sense be taken for granted by developers.

In addition, I think it follows that any mass-market product that has a parallelism-centric purpose is a sign that we are not at that end state yet.

Take, for example, Intel Threading Building Blocks and the recently announced Intel® Parallel Studio. Intel TBB is a great library and I think Intel Parallel Studio will help customers tremendously. But the question is … does the average developer care enough about multicore processors to take on these parallelism-centric tools?

I suspect it may be a bit too much for the average developer, but I do think tools like these will broaden the developer base taking advantage of parallelism and perhaps that is the best that can be asked.

Putting it all together – perhaps the end state is one where:

  • the average developer (apprentice) doesn’t have to care about parallel programming to take advantage of multicore – they derive benefit from domain-specific parallel libraries
  • the experienced developer (journeyperson) increases the performance of their application by using first-class language support for concurrency and is able to take good advantage of features in parallelism-centric tools where needed
  • the expert (architect) designs the use of key concurrency features into their application and is able to wield these parallelism-centric tools with ease

Thoughts?

Max

Category Uncategorized | 4 Comments »

TBD: The Perfect Architecture Migration Guide

post time 20. August 2008 member admin

I have a presentation to give in about two weeks regarding best known methods on architecture migrations. As usual, my initial research consists of using Google and searching for “architecture migration” with various additional terms to constrain the search to microprocessor migration and not anything associated with buildings or structures and whatnot.

I found some interesting reads, but no clear-cut template on what a complete architecture migration guide should specify.

Some of the interesting hits included:

The first two links detail case studies in performing a migration. This is nice because it shows a real example. The drawback to case studies, of course, is that they are sometimes too specific; if your planned migration doesn’t match what’s detailed in the case study, you may be no better off for having read it. The last link is a website with a number of articles on the topic of migrating and seemed motivated by the move to the Intel 64 ISA from 32-bit platforms. The linked papers on the site provide some helpful information, but are topic-specific and hit or miss.

If you are performing a migration to Embedded Intel Architecture, a good resource is chapter 4 of my book, “Software Development for Embedded Multi-core Systems.”

http://www.elsevier.com/wps/find/bookdescription.cws_home/714327/description

I don’t mean to toot my own horn, but the chapter was written with such a migration in mind so it’s pretty complete. Of course, there’s always room for improvement. The chapter provides a good general overview of migrating to Intel Architecture. I believe it could be improved by providing a comparison/contrast on the architectures from which you’d be migrating such as ARM, PPC, MIPS, and a couple of case studies.

With that in mind, here’s a high level outline on what I think is needed to fully cover the topic of architecture migration.

  • General introduction
    • Discuss Application-level vs. System-level migrations
    • Compare/contrast architectures
    • Compare/contrast operating systems
  • Discuss choosing an operating system and pros/cons of each
    • Proprietary – pros – if it’s your OS, you should be familiar with it. Cons – must roll everything yourself, especially support for SMP, new instruction sets, tools
    • Linux – pros – high level of support, lots of choice, dev machines can run the same OS as the target. Cons – may not meet real-time requirements, code rewrite probably required for system-level stuff
    • Embedded Linux
    • Embedded OSes (the VxWorks, FreeBSD, and QNX of the world)
  • Boot/BIOS – Discuss bootup/BIOS
  • 32 vs. 64 bit – Compare/contrast
  • Architecture extensions – Discuss SIMD instruction sets
  • Endianness – Discuss big & little endian and possible issues in porting code. Some solutions to the problems.
  • Multicore specific – Discuss SMP or AMP support and ramifications
  • Tools Support
    • Compiler, debugger – typically required
    • Performance analysis, threading tools – typically nice to have
  • Case Studies
    • System-level migration running the gamut of embedded issues
    • Application-level migration focusing on SW issues (library support, etc.)
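As a small illustration of the endianness item in the outline above, here is a sketch in Python using the standard `struct` module.  The value `0x12345678` and the helper name `read_u32_be` are chosen purely for this example.  The point is that the same four bytes decode to different numbers depending on the byte order the reading code assumes, which is exactly what bites when code moves between a big-endian and a little-endian architecture.

```python
import struct
import sys

value = 0x12345678

# The same 32-bit value laid out in memory under each byte order.
big = struct.pack(">I", value)     # most significant byte first
little = struct.pack("<I", value)  # least significant byte first

print("host byte order:", sys.byteorder)
print("big-endian bytes:   ", big.hex())
print("little-endian bytes:", little.hex())

# Reading big-endian data while assuming little-endian silently gives a wrong value.
misread = struct.unpack("<I", big)[0]
print(f"misread value: {misread:#010x}")


def read_u32_be(buf: bytes, offset: int = 0) -> int:
    """Read a 32-bit unsigned integer stored big-endian, regardless of host order."""
    return struct.unpack_from(">I", buf, offset)[0]


print(f"portable read: {read_u32_be(big):#010x}")
```

One common remedy, sketched by `read_u32_be`, is to state the byte order explicitly wherever data crosses a file or wire boundary instead of relying on the host’s native layout.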

Thus, there it is - the perfect migration guide. The problem: someone has to write it. Perhaps this will serve as input into a 2nd edition of my book :-).

Category Uncategorized | 0 Comments »

Excellent Brief Read on Multicore Software

post time 29. July 2008 member admin

I recently downloaded and read “How to Survive the Multicore Software Revolution” by Charles Leiserson and Ilya Mirman (http://www.cilk.com/multicore-e-book/) and was quite impressed.  If you are new to multicore software development or would like a brief refresher on the topic, the article is a quick 30 minute read to get you up to speed.

Some strong points about the article:

  • Very concise coverage of the challenge with multicore, basic performance terms, parallel programming terms, and race conditions in the introductory sections.
  • Good, simple critique of roll-your-own multithreading - I agree: when you do this, your parallel version typically becomes very different from the serial version, and scalability is typically limited because the developer more or less hard-codes for one multicore processor.
  • Good summary of existing concurrency platforms.
  • 20 questions cover several aspects of software product development, not just the technical ones
  • Loved the bios - always good to add a personal touch

With the positive out of the way, here are my two main nits; perhaps they boil down to one, which is that the paper seems focused on homogeneous multicore:

  • Seemed to be a bit Intel focused - with respect to the company I work for, this is fantastic, and I suppose you could make the case that many/most of the multicore processors in desktop/server are x86, but there are other programming models and architectures.
  • I believe many customers are eschewing multithreading in favor of partitioning/multiprocessing.  It would have been good to comment on why this approach won’t necessarily scale in the long term.

That said, again I really enjoyed the article and recommend it.  It has given me some things to think about as we pursue the Multicore Programming Practices guide in MCA (http://www.multicore-association.org/workgroup/mpp.php).

One of our challenges in the group will be giving good coverage of best practices for several different multicore architectures split along multiple dimensions, e.g. homogeneous/heterogeneous, shared/distributed memory, symmetric/asymmetric.  It will be interesting to see how/if the practices generalize.

Max  
 

Category Uncategorized | 4 Comments »

List of the ‘happening’ Embedded conferences

post time 20. July 2008 member admin

Who knows where and when all of the good embedded focused conferences are?  I was a little disheartened because I believe I have a good feel for the pulse of the industry, but perhaps not as much as I’d thought.

I was in Las Vegas last week for some vacation (NBA Summer League) and learned there was a pretty large technical conference occurring at the same time that I missed.  The conference was WorldComp’08 (http://www.world-academy-of-science.org/worldcomp08/ws) and featured tracks on embedded, communications, and parallel processing, to name a few.  Looking at the agenda online, this would have been a pretty good conference for me to attend, and this got me wondering: are there other conferences coming up in 2008 that I should know about and that provide a good pulse on what is going on in embedded?

With that, I’ll share my list of what I know about.  I’d welcome pointers to others, so please feel free to share:

- NI Week, Austin, August 5-7
- Embedded Systems Conference, Boston, October 26-30
- Multicore Expo, Tokyo, November 6-7

Best regards,

Max

Category Uncategorized | 1 Comment »

Domeika’s Dilemma is powered by WordPress