Homework 2 - Compilers, Assignments of Compilers

CS6241 - 2025 - Santosh Pandey

Typology: Assignments

2024/2025

Uploaded on 04/21/2025

rohith-9


CS 6241: Compiler Optimizations

Homework 2

Important Policies - READ CAREFULLY

  1. Georgia Tech honor code will be strictly enforced.
  2. This is an individual assignment.
  3. Copying solutions from any source is an act of plagiarism.
  4. Make reasonable assumptions and clearly state the assumptions made.
  5. Please use Canvas Ed Discussion for any questions or clarifications.
  6. Give enough detail, but don’t write essays; be crisp and precise. Write solutions using pseudo-code algorithms and notation. Please follow the suggested page limit for each question.

Please take the page limits seriously. We will not grade past the given page limits. Please use LaTeX, Word, or equivalent software. Do not turn in handwritten assignments. The only exception is that you may hand draw any graphs and insert photos of them into the assignment, as long as they are clear. A better option is to use Graphviz or the LaTeX TikZ package to create importable graphs.

IMPORTANT NOTE: If a problem asks you to make an improvement to an algorithm, you must make a theoretical improvement. Data structure improvements do not count and will be given a zero.

Question 1 [20 points][1.5 pages]

Consider the problem of detecting available expressions for CSE.

  1. Develop and reason about the safety condition for the Available Expression Analysis. That is, what relationship must be true about the set of available expressions found by the analysis and the ones that could truly exist in a program under all possible inputs.
  2. In the presence of pointers, revise the dataflow framework for Available Expressions to handle each of the following points-to cases. Using the revised framework, show how you meet the safety condition developed in (1), and illustrate cases 5 and 6 via examples.
     - Case 1: The alias is precisely known: pointer p points to x. A downward-exposed expression x+y is included in the analysis and is followed by an l-value dereference *p = ...
     - Case 2: The same setting, but the downward-exposed expression is p+y.
     - Cases 3 and 4: Redo both of the above cases when p is aliased to x or z at the given program point, and it is known that p is not aliased with anything else.
     - Cases 5 and 6: Redo both cases when p’s alias is statically unknown due to pointer arithmetic, i.e., one is unable to rule out an alias of p to any variable that can participate in an expression.
  3. Modify the available expression analysis to find available expressions on demand inside a given basic block B. This is called demand-driven analysis and is mainly used for doing CSE only within loops. Note that here you will not perform a whole-CFG analysis; you will only detect the availability of those expressions that are generated inside loop basic blocks, for the purposes of redundancy elimination.
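
As background for Question 1, the textbook form of available expressions analysis (before any of the pointer-aware or demand-driven changes the question asks for) can be sketched as below. This is a minimal sketch under stated assumptions, not a solution to any part of the question: the function name `available_expressions`, the block names `B1` to `B4`, and the per-block `gen`/`kill` sets are all hypothetical, and expressions are represented as plain strings.

```python
def available_expressions(blocks, preds, gen, kill, entry):
    """Classic forward must-analysis over a CFG.

    IN[B]  = intersection of OUT[P] over all predecessors P of B
    OUT[B] = gen[B] | (IN[B] - kill[B])
    Nothing is available on entry to the function.
    """
    universe = set().union(*gen.values(), *kill.values())
    # Optimistic start: assume everything is available, then shrink.
    inn = {b: set(universe) for b in blocks}
    out = {b: set(universe) for b in blocks}
    inn[entry] = set()

    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b != entry:
                inn[b] = set(universe)
                for p in preds[b]:
                    inn[b] &= out[p]
            new_out = gen[b] | (inn[b] - kill[b])
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return inn, out


# Hypothetical diamond: B1 computes a+b, B2 redefines a, B3 does nothing.
blocks = ["B1", "B2", "B3", "B4"]
preds = {"B1": [], "B2": ["B1"], "B3": ["B1"], "B4": ["B2", "B3"]}
gen = {"B1": {"a+b"}, "B2": set(), "B3": set(), "B4": set()}
kill = {"B1": set(), "B2": {"a+b"}, "B3": set(), "B4": set()}

inn, out = available_expressions(blocks, preds, gen, kill, "B1")
print(inn["B4"])  # a+b is killed on the B2 path, so it is not available at B4
```

Because the meet is intersection, an expression is reported at the join only when it survives on every incoming path, which is the conservative direction for CSE.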

Question 2 [20 points][2 pages]

A very busy expression e is defined as one that is anticipatable at a program point p, i.e., there is an evaluation of e on every path that begins at p, before the end of the function or a redefinition of its operands. It is proposed to hoist e to program point p so that it eliminates as many of the original evaluations as possible, leading to code size reduction.

  1. Show an example that illustrates when it is legal to hoist very busy expressions up and when it is not. Also show the case of maximal hoisting when hoisting is legal.
  2. Assuming that the Anticipatable IN[B] and Anticipatable OUT[B] sets are already computed, write a condition for determining legal hoisting points for a set of Very Busy Expressions. Also write a condition for determining maximal hoisting (for maximally covering very busy expressions). Finally, write an algorithm that uses both of these conditions to maximally and legally hoist very busy expressions. The algorithm must reject illegal hoistings.
  3. Show an example CFG illustrating how maximal hoisting is performed and how the algorithm rejects illegal hoistings.
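
The definition of a very busy expression at the start of Question 2 corresponds to the standard backward anticipatability analysis, which can be sketched as below. This only shows how the IN[B] and OUT[B] sets that part 2 takes as given might be computed; it does not attempt the hoisting conditions or the algorithm. The function name `very_busy_expressions`, the block names, and the `use`/`kill` sets are hypothetical, with `use[B]` assumed to hold the expressions evaluated in B before any of their operands is redefined there.

```python
def very_busy_expressions(blocks, succs, use, kill):
    """Classic backward must-analysis over a CFG.

    OUT[B] = intersection of IN[S] over all successors S of B
             (empty for blocks with no successors)
    IN[B]  = use[B] | (OUT[B] - kill[B])
    """
    universe = set().union(*use.values(), *kill.values())
    inn = {b: set(universe) for b in blocks}
    out = {b: set(universe) for b in blocks}

    changed = True
    while changed:
        changed = False
        for b in reversed(blocks):
            if succs[b]:
                new_out = set(universe)
                for s in succs[b]:
                    new_out &= inn[s]
            else:
                new_out = set()  # nothing is anticipated past the exit
            new_in = use[b] | (new_out - kill[b])
            if new_in != inn[b] or new_out != out[b]:
                inn[b], out[b] = new_in, new_out
                changed = True
    return inn, out


# Hypothetical diamond: B1 branches to B2 and B3, which rejoin at B4.
blocks = ["B1", "B2", "B3", "B4"]
succs = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"], "B4": []}
kill = {b: set() for b in blocks}

# a+b is evaluated on both branches, so it is very busy at the end of B1.
both = {"B1": set(), "B2": {"a+b"}, "B3": {"a+b"}, "B4": set()}
print(very_busy_expressions(blocks, succs, both, kill)[1]["B1"])

# a+b is evaluated on one branch only, so it is not very busy at the end of B1.
one = {"B1": set(), "B2": {"a+b"}, "B3": set(), "B4": set()}
print(very_busy_expressions(blocks, succs, one, kill)[1]["B1"])
```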

Question 3 [20 points] [2 pages]

Dataflow analysis can be used to detect unsound programs that have uninitialized variables. A use of an uninitialized variable occurs when an uninitialized variable serves as an operand. We call such uses unsound uses; since garbage values can reside in uninitialized variables, such uses can have arbitrary values and it is desirable to detect them. An unsound use can propagate such values to the definitions of other variables that are derived from it, and this can continue transitively. We call the set of definitions derived from unsound uses unsound definitions. Devise a dataflow analysis to detect and warn about all unsound uses and unsound definitions in a program. Write the dataflow equations first, followed by an iterative algorithm. You must devise the analysis from scratch, without assuming that any other analysis has been done prior to it. Then illustrate it on a CFG. Finally, discuss the complexity of the algorithm in terms of the number of basic blocks, and propose a way to make it faster under the condition that the program has very sparse initialization points for its variables.

Next, consider a branch predicate p(a < b). Due to the unsound usage of a or b, the branch outcome might be wrongly affected here, sending the control flow in the wrong direction. However, even in this situation, not all basic blocks will be affected, i.e., some basic blocks will still execute just fine in terms of control flow (although the dataflow into them would get messed up due to some others not executing properly). Devise an algorithm to detect all such basic blocks whose execution in terms of control flow will not be affected due to unsound uses in the predicate, illustrating it on a sample CFG.
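
To make the terms unsound use and unsound definition concrete, the sketch below classifies the statements of a single straight-line sequence. It is only an illustration of the definitions, under several assumptions: there is no control flow (so none of the dataflow equations, the iterative algorithm, or the branch-predicate analysis asked for above is attempted), each statement is a hypothetical `(target, operands)` pair, and an operand counts as garbage both when it was never assigned and when it was assigned from garbage.

```python
def classify_straight_line(stmts, all_vars):
    """Return (unsound_uses, unsound_defs) for one straight-line sequence.

    stmts    : list of (target, operands); an empty operand list means the
               right-hand side is a constant.
    all_vars : every variable name; all are assumed uninitialised at the start.
    """
    garbage = set(all_vars)
    unsound_uses = []   # (statement index, operand name)
    unsound_defs = []   # statement indices whose target receives garbage
    for i, (target, operands) in enumerate(stmts):
        bad = [v for v in operands if v in garbage]
        unsound_uses.extend((i, v) for v in bad)
        if bad:
            unsound_defs.append(i)
            garbage.add(target)       # garbage flows into the target
        else:
            garbage.discard(target)   # target now holds a well-defined value
    return unsound_uses, unsound_defs


# Hypothetical program:  a = 1;  c = a + b;  d = c;  e = a
stmts = [("a", []), ("c", ["a", "b"]), ("d", ["c"]), ("e", ["a"])]
uses, defs = classify_straight_line(stmts, ["a", "b", "c", "d", "e"])
print(uses)  # b is never initialised; c then carries the garbage into d
print(defs)
```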

Question 4 [20 points] [2 pages]

It is proposed to perform constant propagation only within loops, and for that purpose we would like to write a demand-driven analysis for all the uses within the loop. Note that we only need to find the definitions relevant to the given uses (in this case, uses inside the loop) and drive the analysis accordingly, as opposed to reaching definitions analysis, in which we focus on all definitions and find where they reach. Assume that loop detection is done but that no other analysis (including regular reaching definitions analysis) has been done. Please answer the following questions:

  1. Using an example with a diamond-graph-based loop (4 blocks, one branch, one join, and one back edge) and a few non-loop basic blocks, show which definitions are relevant to such loop-based uses and how they could be found. Note that such definitions might lie outside the loop.
  2. Devise a dataflow formulation for finding such definitions for loop-based uses. Devise a basic-block-based framework first, then extend it across basic blocks. Write the dataflow equations. Is the analysis forward or backward? What is the operator at the join?
  3. Discuss the safety condition and the effect of aliases.

Question 5 [20 points] [2 pages]

On the CFG shown below, perform PRE, showing all the steps. Does the PRE framework discussed in class (Drechsler and Knoop’s papers) completely remove partial redundancy? Show an example where partial redundancy is not completely removed. The figure is on the next page.
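
As a reminder of what partial redundancy means in Question 5, the sketch below contrasts full availability (the expression reaches a block on every path) with partial availability (it reaches on at least one path). It is a minimal sketch under stated assumptions: it does not use the CFG from the figure, it does not implement any of the PRE formulations discussed in class, and the function name `availability`, the block names, and the `gen`/`kill` sets are hypothetical.

```python
def availability(blocks, preds, gen, kill, entry, must):
    """Forward availability; must=True meets with intersection (all paths),
    must=False meets with union (some path). Returns the IN sets."""
    universe = set().union(*gen.values(), *kill.values())
    start = set(universe) if must else set()
    out = {b: set(start) for b in blocks}
    inn = {b: set() for b in blocks}

    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b != entry:
                facts = [out[p] for p in preds[b]]
                if not facts:
                    inn[b] = set(start)
                elif must:
                    inn[b] = set.intersection(*facts)
                else:
                    inn[b] = set.union(*facts)
            new_out = gen[b] | (inn[b] - kill[b])
            if new_out != out[b]:
                out[b] = new_out
                changed = True
    return inn


# Hypothetical diamond: a+b is computed in B2 and again in B4, but not in B3.
blocks = ["B1", "B2", "B3", "B4"]
preds = {"B1": [], "B2": ["B1"], "B3": ["B1"], "B4": ["B2", "B3"]}
gen = {"B1": set(), "B2": {"a+b"}, "B3": set(), "B4": {"a+b"}}
kill = {b: set() for b in blocks}

full = availability(blocks, preds, gen, kill, "B1", must=True)
partial = availability(blocks, preds, gen, kill, "B1", must=False)

# Evaluations in B4 that are redundant on some path but not on all paths.
print(gen["B4"] & (partial["B4"] - full["B4"]))
```

In this example the evaluation of a+b in B4 is redundant only along the path through B2, which is the situation PRE targets.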