Welcome
My name is Parker Lamb - I'm an undergraduate Physics and Astronomy double-major at the University of Washington.
Here you'll find notes for the courses I care to take notes on, plus some projects I'm working on while here. If you have any questions for me, email me at aethio@uw.edu.
Chapter 5 - Magnetostatics
Reference "Introduction to Electrodynamics" (5e) by David Griffiths.
Say instead of static electric charges (electrostatics), we start dealing with moving electric charges (electrodynamics).
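Moving charges will generate a magnetic field around them. If the rate of moving charges (current $I$) is constant, we're working with magnetostatics. The direction of the $\mathbf{B}$-field for some current can be found with the right-hand rule: point your right thumb along the current, and your fingers curl in the direction of $\mathbf{B}$.
Magnetic fields exert forces on moving charges. Using the Lorentz force law,
$$\mathbf{F}_{mag} = Q(\mathbf{v} \times \mathbf{B})$$
Or, in the presence of an electric field as well,
$$\mathbf{F} = Q(\mathbf{E} + \mathbf{v} \times \mathbf{B})$$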
Todo: cyclotron motion?
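Note: magnetic fields do no work, since the magnetic force $Q(\mathbf{v} \times \mathbf{B})$ is always perpendicular to the velocity. The objects that generate the $\mathbf{B}$-fields do the work; it's an apparently subtle distinction that the textbook does not expand on yet.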
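A quick numerical sanity check of the two statements above (a minimal sketch, not from Griffiths; the charge, velocity and field values are made up for illustration). It evaluates $\mathbf{F} = Q(\mathbf{E} + \mathbf{v} \times \mathbf{B})$ and confirms that the magnetic part delivers zero power, $\mathbf{F}_{mag} \cdot \mathbf{v} = 0$.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def lorentz_force(q, v, B, E=(0.0, 0.0, 0.0)):
    """F = q (E + v x B), everything in SI units."""
    v_cross_B = cross(v, B)
    return tuple(q * (e + c) for e, c in zip(E, v_cross_B))


# Illustrative (assumed) values: a proton-like charge moving along x, field along z.
q = 1.602e-19            # C
v = (2.0e5, 0.0, 0.0)    # m/s
B = (0.0, 0.0, 0.5)      # T

F_mag = lorentz_force(q, v, B)   # purely magnetic force
power = dot(F_mag, v)            # rate at which the magnetic force does work

print("F_mag =", F_mag)
print("F_mag . v =", power)
```

The force comes out along $-\hat{y}$, perpendicular to both $\mathbf{v}$ and $\mathbf{B}$, which is why the charge is steered but never sped up.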
Current
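Current is some charge per unit time through some cross-sectional area, measured in amperes such that $1\ \mathrm{A} = 1\ \mathrm{C/s}$.
If we had some line charge of $\lambda$ Coulombs per meter traveling at velocity $v$, $I = \lambda v$. In cases where we're not just along a straight line, $\mathbf{v}$ is a vector and as such current is too:
$$\mathbf{I} = \lambda \mathbf{v}$$
The force on this line of charge is
$$\mathbf{F}_{mag} = \int (\mathbf{I} \times \mathbf{B})\, dl = I \int (d\mathbf{l} \times \mathbf{B})$$
where the last step holds for a constant current $I$.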
Surface and Volume Currents
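If we have some charges flowing across some width $dl_\perp$ on a surface, or across some area $da_\perp$ in a volume, we represent them with surface current density $\mathbf{K} = d\mathbf{I}/dl_\perp$ and volume current density $\mathbf{J} = d\mathbf{I}/da_\perp$.
Similar to a line charge, if our surface has charge density $\sigma$ (or $\rho$ for a volume charge) and charges move at velocity $\mathbf{v}$:
$$\mathbf{K} = \sigma \mathbf{v}, \qquad \mathbf{J} = \rho \mathbf{v}$$
For some volume, the charge conservation equation (or continuity equation) says any charge flowing out of a volume means the charge inside decreases:
$$\nabla \cdot \mathbf{J} = -\frac{\partial \rho}{\partial t}$$
Note: $\partial \rho / \partial t$ means change in charge density per change in time. This is zero in magnetostatics, meaning $\nabla \cdot \mathbf{J} = 0$.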
Biot-Savart Law
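The magnetic field of some steady-state line current is
$$\mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi} I \int \frac{d\mathbf{l}' \times \hat{\mathscr{r}}}{\mathscr{r}^2}$$
where $\mathbf{r}$ is the "test point" where you'd like to know the magnetic field, $\mathbf{r}'$ points to some (temporary) infinitesimal current line element $d\mathbf{l}'$, and $\boldsymbol{\mathscr{r}} = \mathbf{r} - \mathbf{r}'$ points from the current element to the test point.
Note: $\mathbf{B}$ has units of newtons per ampere-meter (or tesla): $1\ \mathrm{T} = 1\ \mathrm{N/(A \cdot m)}$.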
Note: the integration is along the current path.
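$\mu_0$ is the "permeability of free space": $\mu_0 = 4\pi \times 10^{-7}\ \mathrm{N/A^2}$.
The B.S. law also works for surface and volume currents:
$$\mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi} \int \frac{\mathbf{K}(\mathbf{r}') \times \hat{\mathscr{r}}}{\mathscr{r}^2}\, da', \qquad \mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi} \int \frac{\mathbf{J}(\mathbf{r}') \times \hat{\mathscr{r}}}{\mathscr{r}^2}\, d\tau'$$
where $da'$ and $d\tau'$ are the area and volume elements of the source, converting to polar or spherical as necessary.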
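To see the Biot-Savart integral in action, here's a small sketch (my own check, not from the textbook) that sums $\frac{\mu_0 I}{4\pi} \frac{d\mathbf{l}' \times \boldsymbol{\mathscr{r}}}{\mathscr{r}^3}$ over short segments of a circular loop and compares the on-axis result with the standard closed form $B_z = \mu_0 I R^2 / \left(2 (R^2 + z^2)^{3/2}\right)$. The function names and the loop parameters are assumptions for illustration.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space, N/A^2


def b_axis_loop_numeric(I, R, z, n=2000):
    """z-component of B at height z on the axis of a circular loop of radius R,
    found by summing Biot-Savart contributions from n short segments."""
    Bz = 0.0
    dphi = 2 * math.pi / n
    for k in range(n):
        phi = (k + 0.5) * dphi
        # Source point r' on the loop (in the xy-plane) and its line element dl'
        xs, ys = R * math.cos(phi), R * math.sin(phi)
        dlx, dly = -R * math.sin(phi) * dphi, R * math.cos(phi) * dphi
        # Separation vector from the source point to the field point (0, 0, z)
        sx, sy, sz = -xs, -ys, z
        s = math.sqrt(sx * sx + sy * sy + sz * sz)
        # z-component of dl' x (separation vector); dl' has no z-component
        cross_z = dlx * sy - dly * sx
        Bz += MU0 * I / (4 * math.pi) * cross_z / s**3
    return Bz


def b_axis_loop_exact(I, R, z):
    """Closed-form on-axis field of a circular current loop."""
    return MU0 * I * R**2 / (2 * (R**2 + z**2) ** 1.5)


# Illustrative (assumed) values: 1.5 A loop of radius 10 cm, field point 5 cm up the axis.
numeric = b_axis_loop_numeric(1.5, 0.1, 0.05)
exact = b_axis_loop_exact(1.5, 0.1, 0.05)
print(numeric, exact)
```

Only the $z$-component is summed because the horizontal components cancel around the loop by symmetry.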
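Divergence and Curl of $\mathbf{B}$
The curl of any magnetic field is proportional to the current density:
$$\nabla \times \mathbf{B} = \mu_0 \mathbf{J}$$
where $\mathbf{J}$ is a volume current density. It's also related to total enclosed current by integrating over a surface bounded by a closed loop (via Stokes' theorem):
$$\oint \mathbf{B} \cdot d\mathbf{l} = \mu_0 \int \mathbf{J} \cdot d\mathbf{a} = \mu_0 I_{enc}$$
This means stronger currents have higher-magnitude $\mathbf{B}$-fields, and stronger $\mathbf{B}$-fields enclose a higher current / current density within.
The divergence of a magnetic field is always zero:
$$\nabla \cdot \mathbf{B} = 0$$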
A magnetic field will circle around a wire but will not expand outward.
Ampère's Law
For some circular magnetic field (circumference $2\pi s$) around a long straight wire, the path integral of it is independent of radius:
$$\oint \mathbf{B} \cdot d\mathbf{l} = \frac{\mu_0 I}{2\pi s} (2\pi s) = \mu_0 I$$
The circumference of the path (circle, circ. $2\pi s$) increases at the same rate as the magnitude of the $\mathbf{B}$-field decreases ($B \propto 1/s$) - i.e. the product is independent of radius.
This is known as Ampère's Law:
$$\oint \mathbf{B} \cdot d\mathbf{l} = \mu_0 I_{enc}$$
The line integral of the $\mathbf{B}$-field around some closed path is proportional to the enclosed current $I_{enc}$. Ampère's law is to the Biot-Savart law what Gauss's law is to Coulomb's law.
For a surface current along an infinite plane, the Ampèrean loop might be a rectangle perpendicular to the current.
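In that setup, the sides of the Amp. loop perpendicular to the plane aren't aligned with $\mathbf{B}$, so we only care about the two sides parallel to the plane (length $l$ each) - i.e. $\oint \mathbf{B} \cdot d\mathbf{l} = 2Bl = \mu_0 K l$, so $B = \mu_0 K / 2$.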
Note that Ampère's law always holds for steady currents, but it's only useful for finding $\mathbf{B}$ when the symmetry is high enough:
- Infinite straight lines / cylinders
- Infinite planes
- Infinite solenoids
- Toroids (see Griffiths Ex. 5.10)
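As a check that the path integral really only cares about the enclosed current (and not the shape of the loop), here's a rough sketch that integrates $\mathbf{B} \cdot d\mathbf{l}$ around a square loop using the infinite-wire field $B = \mu_0 I / (2\pi s)$ in the $\hat{\phi}$ direction. The helper names, loop sizes and midpoint-rule integration are my own choices, not anything from Griffiths.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space, N/A^2


def b_wire(x, y, I):
    """(Bx, By) at (x, y) for an infinite wire along z through the origin."""
    s2 = x * x + y * y
    c = MU0 * I / (2 * math.pi * s2)
    return (-c * y, c * x)


def circulation_square(I, half_side, cx=0.0, cy=0.0, n=4000):
    """Midpoint-rule line integral of B . dl counterclockwise around a square
    of the given half side length centered at (cx, cy)."""
    a = half_side
    corners = [(cx - a, cy - a), (cx + a, cy - a), (cx + a, cy + a), (cx - a, cy + a)]
    total = 0.0
    for i in range(4):
        x0, y0 = corners[i]
        x1, y1 = corners[(i + 1) % 4]
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for k in range(n):
            xm = x0 + (k + 0.5) * dx
            ym = y0 + (k + 0.5) * dy
            bx, by = b_wire(xm, ym, I)
            total += bx * dx + by * dy
    return total


I_wire = 2.0  # A, illustrative value
small_loop = circulation_square(I_wire, 0.5)           # encloses the wire
big_loop = circulation_square(I_wire, 3.0)             # encloses the wire, much larger
outside_loop = circulation_square(I_wire, 1.0, cx=5.0) # does not enclose the wire
print(small_loop, big_loop, outside_loop, MU0 * I_wire)
```

Both enclosing loops should land on $\mu_0 I$, and the loop that misses the wire should land on zero, even though $\mathbf{B}$ is nonzero everywhere along it.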
Magnetic Vector Potential
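Magnetic vector potential $\mathbf{A}$ is the magnetostatics equivalent to electric potential $V$.
$\nabla \cdot \mathbf{B} = 0$. If we instead write this as $\mathbf{B} = \nabla \times \mathbf{A}$ (the divergence of a curl is always zero), Griffiths has a further derivation of this in 5.4.1. Ultimately, choosing $\nabla \cdot \mathbf{A} = 0$, we get to the magnetic Poisson's equation equivalent:
$$\nabla^2 \mathbf{A} = -\mu_0 \mathbf{J}$$
with solution (for currents that go to zero at infinity)
$$\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi} \int \frac{\mathbf{J}(\mathbf{r}')}{\mathscr{r}}\, d\tau'$$
For a line current, $\mathbf{A} = \frac{\mu_0}{4\pi} \int \frac{\mathbf{I}}{\mathscr{r}}\, dl' = \frac{\mu_0 I}{4\pi} \int \frac{d\mathbf{l}'}{\mathscr{r}}$, where we can pull out $I$ if constant. Use $\mathbf{K}$ (with $da'$) or $\mathbf{J}$ (with $d\tau'$) for surface and volume currents respectively.
The direction of $\mathbf{A}$ will almost always be the same as that of the current.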
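Dipole Moment of $\mathbf{A}$
The explicit multipole expansion of $\mathbf{A}$ is in Griffiths 5.4.3. I've only included the dipole term for brevity, and because magnetic monopoles have never been observed & higher-order terms are rarely helpful.
The dipole term tends to dominate magnetic vector potential multipole expansions (no monopole given $\nabla \cdot \mathbf{B} = 0$), and is represented as
$$\mathbf{A}_{dip}(\mathbf{r}) = \frac{\mu_0}{4\pi} \frac{\mathbf{m} \times \hat{\mathbf{r}}}{r^2}$$
where
$$\mathbf{m} = I \int d\mathbf{a} = I \mathbf{a}$$
is the magnetic dipole moment. This is independent of the choice of origin, since the magnetic monopole term always vanishes.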
Chapter 6 - Magnetic Fields in Matter
Reference "Introduction to Electrodynamics" (5e) by David Griffiths.
If a material is placed in a $\mathbf{B}$-field, it will acquire a magnetic polarization (or magnetization). Different materials acquire different polarizations depending on their atomic structures.
- Paramagnets: magnetization parallel to applied $\mathbf{B}$, typically materials with an odd # of electrons.
- Diamagnets: magnetization antiparallel to applied $\mathbf{B}$, typically materials with an even # of electrons.
- Ferromagnets: magnetization persists on the material even after the applied $\mathbf{B}$-field is removed, and is determined by the whole "magnetic history" of the object.
All electrons act as magnetic dipoles.
Imagine a magnetic dipole as pointing from south to north (the Gilbert model). It's an inaccurate model at small scales according to Griffiths, but he recommends it for intuition.
Torques and Forces on Magnetic Dipoles
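Magnetic dipoles will experience some torque in an applied field,
$$\mathbf{N} = \mathbf{m} \times \mathbf{B}$$
where $\mathbf{N}$ is the torque and $\mathbf{m}$ is the magnetic dipole moment. For paramagnetic materials (odd electrons), $\mathbf{m}$ will be roughly in the same direction as the applied $\mathbf{B}$-field.
For a rectangular current loop, $m = Iab$ where $a, b$ are side lengths.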
In a uniform field the net force on the dipole is zero, though this is not the case for nonuniform fields.
For an infinitesimal loop with dipole moment $\mathbf{m}$ in field $\mathbf{B}$, the force on the loop is
$$\mathbf{F} = \nabla (\mathbf{m} \cdot \mathbf{B})$$
Diamagnets
Diamagnetism affects all materials, but is much weaker than paramagnetism, so is most easily observed in materials with an even number of electrons.
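When an external $\mathbf{B}$-field is applied to a material, individual orbiting electrons will speed up according to
$$\Delta v = \frac{eRB}{2m_e}$$
where $R$ is the "radius" of the electron's orbit around the nucleus of an atom. This increase in orbital speed will change the dipole moment:
$$\Delta \mathbf{m} = -\frac{e^2 R^2}{4 m_e} \mathbf{B}$$
This change in the dipole moment is antiparallel to $\mathbf{B}$, as the minus sign shows.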
Magnetization
Magnetization is
$$\mathbf{M} \equiv \text{magnetic dipole moment per unit volume}$$
For a paramagnet, perhaps suspended above a solenoid, the magnetization would be positive/upward, and force downward. For a diamagnet, the magnetization would be instead downward, and force upward.
In general in a nonuniform field, paramagnets are attracted into the field, and diamagnets are repelled away.
Note: $\mathbf{M}$ is an average over a wildly complex set of infinitesimal dipoles and "smooths out" the dipoles into a macroscopic view.
Note: both diamagnetism and paramagnetism are quite weak compared to, for instance, ferromagnetism, and so are often neglected in experimental calculations.
Field of a Magnetized Object
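For some single dipole, the magnetic vector potential is
$$\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi} \frac{\mathbf{m} \times \hat{\mathscr{r}}}{\mathscr{r}^2}$$
For a magnetized object with magnetization $\mathbf{M}$,
$$\mathbf{A}(\mathbf{r}) = \frac{\mu_0}{4\pi} \int \frac{\mathbf{M}(\mathbf{r}') \times \hat{\mathscr{r}}}{\mathscr{r}^2}\, d\tau'$$
Alternatively, we can look at the object in terms of its bound volume current density $\mathbf{J}_b$ and bound surface current density $\mathbf{K}_b$, where
$$\mathbf{J}_b = \nabla \times \mathbf{M}, \qquad \mathbf{K}_b = \mathbf{M} \times \hat{\mathbf{n}}$$
This means the potential (and therefore magnetic field) is the same as would be made by some volume current $\mathbf{J}_b$ throughout the material plus the surface current $\mathbf{K}_b$ on the boundary.
This means we needn't integrate all the infinitesimal dipoles, but rather just determine the bound currents $\mathbf{J}_b$ and $\mathbf{K}_b$ and find the field they produce.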
Note:
Chapter 1 - Relevant Mathematics
Reference "Introduction to Electrodynamics" by David Griffiths.
Vectors
We start with a brief review of vector algebra - I'll skip most of it but keep the key things.
Dot product
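The dot product is a measure of how "parallel" two vectors are:
$$\mathbf{A} \cdot \mathbf{B} = AB \cos\theta$$
It is maximized when the vectors are parallel, zero when perpendicular, and most negative when antiparallel.
It is commutative ($\mathbf{A} \cdot \mathbf{B} = \mathbf{B} \cdot \mathbf{A}$) and distributive ($\mathbf{A} \cdot (\mathbf{B} + \mathbf{C}) = \mathbf{A} \cdot \mathbf{B} + \mathbf{A} \cdot \mathbf{C}$). The result of the dot product is a scalar.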
Cross product
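The cross product yields a third vector orthogonal to both $\mathbf{A}$ and $\mathbf{B}$, maximized in magnitude when $\mathbf{A}$ and $\mathbf{B}$ are themselves orthogonal to one another.
The cross product is distributive, but not commutative - instead, it is anticommutative:
$$\mathbf{A} \times \mathbf{B} = -(\mathbf{B} \times \mathbf{A})$$
Some other properties: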
- The result of the cross product is a vector, not a scalar.
- The magnitude of the cross product is the area of the parallelogram generated by $\mathbf{A}$ and $\mathbf{B}$.
- The cross product of a vector with itself is zero.
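We can calculate a cross product like so:
$$\mathbf{A} \times \mathbf{B} = AB \sin\theta\, \hat{\mathbf{n}}$$
where $\hat{\mathbf{n}}$ is the unit normal to the plane formed by $\mathbf{A}$ and $\mathbf{B}$, with direction given by the right-hand rule.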
Vector triple product
Triple products are combinations of cross and dot products.
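- Scalar triple products, $\mathbf{A} \cdot (\mathbf{B} \times \mathbf{C})$: the magnitude of this is the volume of the parallelepiped (3D parallelogram) generated by $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$.
- Vector triple products, $\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{B}(\mathbf{A} \cdot \mathbf{C}) - \mathbf{C}(\mathbf{A} \cdot \mathbf{B})$: there's no easy geometric interpretation of this, but it is useful to reduce complex cross product calculations. It can be memorized by the mnemonic BAC-CAB.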
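Both triple products are easy to check numerically. A minimal sketch with three arbitrary vectors (picked by me, nothing special about them): it verifies the BAC-CAB rule $\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{B}(\mathbf{A} \cdot \mathbf{C}) - \mathbf{C}(\mathbf{A} \cdot \mathbf{B})$ component by component, and computes the parallelepiped volume $|\mathbf{A} \cdot (\mathbf{B} \times \mathbf{C})|$.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


# Arbitrary test vectors (assumed values, chosen only for illustration)
A = (1.0, 2.0, 3.0)
B = (-2.0, 0.5, 4.0)
C = (3.0, -1.0, 2.0)

# Left side: A x (B x C)
lhs = cross(A, cross(B, C))
# Right side: B (A . C) - C (A . B), the "BAC-CAB" rule
rhs = tuple(b * dot(A, C) - c * dot(A, B) for b, c in zip(B, C))

# Scalar triple product: volume of the parallelepiped spanned by A, B, C
volume = abs(dot(A, cross(B, C)))

print("A x (B x C) =", lhs)
print("BAC - CAB   =", rhs)
print("volume      =", volume)
```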
Spatial vectors
The position vector $\mathbf{r}$ indicates the position of a point relative to the origin.
In electrodynamics, usually we'll have two points: one for the source charge ($\mathbf{r}'$) and one for the test charge ($\mathbf{r}$). The vector from one to the other, the separation vector $\boldsymbol{\mathscr{r}} = \mathbf{r} - \mathbf{r}'$, is the useful quantity then.
The Del Operator
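Del ($\nabla$) is a vector operator that acts upon functions, and is defined as
$$\nabla = \hat{\mathbf{x}} \frac{\partial}{\partial x} + \hat{\mathbf{y}} \frac{\partial}{\partial y} + \hat{\mathbf{z}} \frac{\partial}{\partial z}$$
$\nabla$ is really only meaningful when applied to functions, and has three (usual) ways of being applied:
- To a scalar function ($\nabla T$) to get the gradient of that function.
- To a vector function via the dot product ($\nabla \cdot \mathbf{v}$) to get that function's divergence.
- To a vector function via the cross product ($\nabla \times \mathbf{v}$) to get that function's curl.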
Gradients
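The gradient of some scalar function $T$ gives a vector result that will point in the direction of the max rate of increase of that function:
$$\nabla T = \frac{\partial T}{\partial x} \hat{\mathbf{x}} + \frac{\partial T}{\partial y} \hat{\mathbf{y}} + \frac{\partial T}{\partial z} \hat{\mathbf{z}}$$
and results in a vector with units of the original function per unit length.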
Fundamental theorem for gradients
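Important: The integral of a derivative (here, the gradient) is given by the value of the function at the boundaries $\mathbf{a}$ and $\mathbf{b}$:
$$\int_{\mathbf{a}}^{\mathbf{b}} (\nabla T) \cdot d\mathbf{l} = T(\mathbf{b}) - T(\mathbf{a})$$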
Divergence
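The divergence of some vector function $\mathbf{v}$ represents how much the function "spreads out" or diverges from a given point:
$$\nabla \cdot \mathbf{v} = \frac{\partial v_x}{\partial x} + \frac{\partial v_y}{\partial y} + \frac{\partial v_z}{\partial z}$$
and results in a scalar value (with units of the function per unit length, so not dimensionless) indicating the rate of divergence from a point.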
Green's theorem
Green's theorem (also called Gauss's theorem or the divergence theorem) is a special case of the generalized Stokes' theorem (see below). Geometrically, it relates the sources contained in some volume to the flux emitted by those sources through the boundary surface of that volume:
$$\int_V (\nabla \cdot \mathbf{v})\, d\tau = \oint_S \mathbf{v} \cdot d\mathbf{a}$$
In the case of electrostatics, this might relate the field generated by some charges in a volume to the total, superposition flux coming out of that entire region - turning one "field of charges" into a single charge.
The integral of a derivative over some region will always be equal to the value of the function at the boundary - in this case, the boundary term is itself an integral (a line has just two endpoints, but the boundary of a volume is a closed surface).
Curl
The curl of some vector function represents how much the function "swirls around" some given point. Positive curl is given by the right-hand rule (normally counterclockwise).
Stokes' Theorem
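The integral of the curl over some surface is the "total amount of swirl" on that surface - and mathematically can be distilled into just finding how much the flow is following the boundary of that surface:
$$\int_S (\nabla \times \mathbf{v}) \cdot d\mathbf{a} = \oint_P \mathbf{v} \cdot d\mathbf{l}$$
This last quantity, $\oint \mathbf{v} \cdot d\mathbf{l}$, is sometimes called the "circulation" of some vector field $\mathbf{v}$.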
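A Note on Del Squared ($\nabla^2$)
Enumerating over the ways we can apply $\nabla$ twice:
- Laplacian: the divergence of the gradient of a function results in the Laplacian, such that $\nabla \cdot (\nabla T) = \nabla^2 T$.
- Curl of gradient: always zero, $\nabla \times (\nabla T) = 0$.
- Gradient of divergence, $\nabla (\nabla \cdot \mathbf{v})$: not a lot of physical applications, calculable though.
- Divergence of curl: always zero, $\nabla \cdot (\nabla \times \mathbf{v}) = 0$.
- Curl of curl: gives nothing new, since it reduces to the others via $\nabla \times (\nabla \times \mathbf{v}) = \nabla (\nabla \cdot \mathbf{v}) - \nabla^2 \mathbf{v}$.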
Integrals
In electrodynamics, we have line (path) integrals, surface integrals (or flux) and volume integrals.
Line integrals
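Iterate over infinitesimally small displacement vectors $d\mathbf{l}$ along a path from $\mathbf{a}$ to $\mathbf{b}$: $\int_{\mathbf{a}}^{\mathbf{b}} \mathbf{v} \cdot d\mathbf{l}$. If $\mathbf{a} = \mathbf{b}$, the line integral runs over a closed loop and we can write $\oint \mathbf{v} \cdot d\mathbf{l}$.
In elementary physics, a common example is work done by a force, $W = \int \mathbf{F} \cdot d\mathbf{l}$.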
While path taken normally matters (distance), there are some vectors where only displacement matters (for forces, these are called conservative forces, like gravity).
Wikipedia has a really cool animation of line integrals:
Surface integrals
Surface integrals iterate over infinitesimally small areas $d\mathbf{a}$, with normals perpendicular to the surface: $\int_S \mathbf{v} \cdot d\mathbf{a}$. In this case, if the surface is closed (i.e. like a sphere, rather than a hill), then we can write it in closed form, $\oint \mathbf{v} \cdot d\mathbf{a}$.
Surface integrals are really useful for things like flux where something is moving through some area - i.e. imagine radio waves passing through a curved gas cloud or radiation falling "through" a planet.
Volume integrals
Volume integrals integrate over infinitesimally small volumes $d\tau$: $\int_V T\, d\tau$.
If some function $T$ represented density of a substance, then the volume integral over it would be total mass.
Integration by Parts
Also known as the "inverse product rule":
$$\int_a^b f \frac{dg}{dx}\, dx = -\int_a^b g \frac{df}{dx}\, dx + fg \Big|_a^b$$
Curvilinear Coordinates
Taking another look at spherical and cylindrical coordinates, as well as how to convert between them.
Spherical coordinates
Three terms:
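- $r$: the distance from the origin (range 0 to $\infty$)
- $\theta$: the polar angle from the $z$-axis (range 0 to $\pi$)
- $\phi$: the azimuthal angle around the $z$-axis (range 0 to $2\pi$)
In terms of Cartesian coordinates, $x = r \sin\theta \cos\phi$, $y = r \sin\theta \sin\phi$, $z = r \cos\theta$.
Alternatively in terms of the unit vectors:
$$\begin{pmatrix} \hat{\mathbf{r}} \\ \hat{\boldsymbol{\theta}} \\ \hat{\boldsymbol{\phi}} \end{pmatrix} = \begin{pmatrix} \sin\theta \cos\phi & \sin\theta \sin\phi & \cos\theta \\ \cos\theta \cos\phi & \cos\theta \sin\phi & -\sin\theta \\ -\sin\phi & \cos\phi & 0 \end{pmatrix} \begin{pmatrix} \hat{\mathbf{x}} \\ \hat{\mathbf{y}} \\ \hat{\mathbf{z}} \end{pmatrix}$$
The interesting thing about the matrix form is that it is orthogonal - and for orthogonal matrices, $A^{-1} = A^T$.
We can use this to solve for $\hat{\mathbf{x}}$, $\hat{\mathbf{y}}$ and $\hat{\mathbf{z}}$ easily then by just taking the transpose of the above matrix.
Our line element in spherical coordinates is
$$d\mathbf{l} = dr\, \hat{\mathbf{r}} + r\, d\theta\, \hat{\boldsymbol{\theta}} + r \sin\theta\, d\phi\, \hat{\boldsymbol{\phi}}$$
In terms of triple integrals, the textbook has a good graphic to represent the three displacements $dl_r = dr$, $dl_\theta = r\, d\theta$ and $dl_\phi = r \sin\theta\, d\phi$.
And an infinitesimal volume element is the product of these three:
$$d\tau = r^2 \sin\theta\, dr\, d\theta\, d\phi$$
$r$ has a possible range from 0 to $\infty$, $\theta$ from 0 to $\pi$ and $\phi$ from 0 to $2\pi$.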
The textbook represents the spherical representations of the gradient, divergence, curl and Laplacian in eqs. 1.70-1.73.
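To convince myself the spherical volume element $d\tau = r^2 \sin\theta\, dr\, d\theta\, d\phi$ is right, here's a small sketch that integrates it over a ball and compares with $\frac{4}{3}\pi R^3$, plus a helper for the conversion $x = r \sin\theta \cos\phi$, $y = r \sin\theta \sin\phi$, $z = r \cos\theta$. The function names and grid sizes are arbitrary choices of mine.

```python
import math


def spherical_to_cartesian(r, theta, phi):
    """Convert (r, polar angle theta, azimuthal angle phi) to (x, y, z)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def sphere_volume_numeric(R, n_r=200, n_theta=200):
    """Midpoint-rule integral of r^2 sin(theta) dr dtheta dphi over a ball of radius R.
    Nothing depends on phi, so that integral is done by hand (a factor of 2 pi)."""
    dr = R / n_r
    dtheta = math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            theta = (j + 0.5) * dtheta
            total += r**2 * math.sin(theta) * dr * dtheta
    return 2 * math.pi * total


R_ball = 2.0  # illustrative radius
volume_numeric = sphere_volume_numeric(R_ball)
volume_exact = 4.0 / 3.0 * math.pi * R_ball**3
print(volume_numeric, volume_exact)
```

Dropping the $\sin\theta$ factor (a classic mistake) would overcount the volume near the poles, so this is a decent way to catch it.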
Cylindrical coordinates
Three terms:
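- $s$: distance from $z$-axis
- $\phi$: azimuthal angle around $z$-axis
- $z$: height along $z$-axis
with unit vectors
$$\hat{\mathbf{s}} = \cos\phi\, \hat{\mathbf{x}} + \sin\phi\, \hat{\mathbf{y}}, \qquad \hat{\boldsymbol{\phi}} = -\sin\phi\, \hat{\mathbf{x}} + \cos\phi\, \hat{\mathbf{y}}, \qquad \hat{\mathbf{z}} = \hat{\mathbf{z}}$$
For triple integrals, the infinitesimal displacements are $dl_s = ds$, $dl_\phi = s\, d\phi$ and $dl_z = dz$; our volume element is $d\tau = s\, ds\, d\phi\, dz$ and line displacement is $d\mathbf{l} = ds\, \hat{\mathbf{s}} + s\, d\phi\, \hat{\boldsymbol{\phi}} + dz\, \hat{\mathbf{z}}$.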
Dirac Delta Function
Imagine we have a point mass at the origin (finite mass, zero size, so infinite density), and it is the only thing in the universe.
The total mass of the universe must be just the mass of the point mass; but the point mass exists only at $\mathbf{r} = 0$ - it has no spatial dimension. If we were to integrate the density over the entire universe, we'd find we have just the mass of the point mass - and can mathematically represent that with the Dirac delta function.
$$\delta(x) = \begin{cases} 0 & x \neq 0 \\ \infty & x = 0 \end{cases}$$
and
$$\int_{-\infty}^{\infty} \delta(x)\, dx = 1$$
Therefore, if we have some continuous function $f(x)$ (could be a constant or a function, anything attached to $\delta(x)$),
$$f(x)\, \delta(x) = f(0)\, \delta(x)$$
which brings us to the neat and nice identity:
$$\int_{-\infty}^{\infty} f(x)\, \delta(x)\, dx = f(0)$$
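The delta function isn't a real function, but we can mimic it with a very narrow normalized Gaussian and watch the sifting identity $\int f(x)\, \delta(x)\, dx = f(0)$ emerge. This is just an illustrative sketch: the Gaussian width, integration window and test function are all my own assumptions.

```python
import math


def gaussian_delta(x, eps):
    """Normalized Gaussian of width eps; approaches delta(x) as eps -> 0."""
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))


def sift(f, eps=1e-3, half_width=0.05, n=20000):
    """Midpoint-rule integral of f(x) * delta_eps(x) over [-half_width, half_width].
    The window is 50 widths wide on each side, so the Gaussian tails are negligible."""
    dx = 2 * half_width / n
    total = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * dx
        total += f(x) * gaussian_delta(x, eps) * dx
    return total


def f_example(x):
    """Arbitrary smooth test function with f(0) = 4."""
    return math.cos(x) + 3.0


sifted = sift(f_example)
print(sifted, f_example(0.0))
```

Shrinking `eps` further pushes the result closer to $f(0)$, as long as the grid spacing stays well below `eps`.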
Dirac Delta in 2D, 3D
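$$\delta^3(\mathbf{r}) = \delta(x)\, \delta(y)\, \delta(z)$$
And, unsurprisingly,
$$\int_{\text{all space}} \delta^3(\mathbf{r})\, d\tau = 1$$
Note on $\nabla \cdot (\hat{\mathscr{r}} / \mathscr{r}^2)$:
$$\nabla \cdot \left( \frac{\hat{\mathscr{r}}}{\mathscr{r}^2} \right) = 4\pi\, \delta^3(\boldsymbol{\mathscr{r}})$$
with $\boldsymbol{\mathscr{r}} = \mathbf{r} - \mathbf{r}'$ (the separation vector), and
$$\nabla^2 \frac{1}{\mathscr{r}} = -4\pi\, \delta^3(\boldsymbol{\mathscr{r}})$$
Reference the textbook equations 1.100-1.102 for derivation.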
Chapter 2 - Electrostatics
Reference "Introduction to Electrodynamics" by David Griffiths.
Fundamentally, electrodynamics seeks to model the interaction of some set of source charges $q_1, q_2, q_3, \ldots$ on some other test charge, $Q$. We can do this via superposition, which states the interaction between two charges is unaffected by the presence of others - i.e. $\mathbf{F}_1$, the force between $q_1$ and $Q$, is independent of the presence of $q_2$ and $q_3$. We can then sum up the forces to get our net force on $Q$:
$$\mathbf{F} = \mathbf{F}_1 + \mathbf{F}_2 + \mathbf{F}_3 + \ldots$$
In an electrostatic system (no moving charges, as opposed to an electrodynamic system), the forces can be calculated via Coulomb's law.
Coulomb's Law
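Electrostatics simply takes into account distance and charge strength. Coulomb's law is
$$\mathbf{F} = \frac{1}{4\pi\epsilon_0} \frac{qQ}{\mathscr{r}^2} \hat{\mathscr{r}}$$
$\epsilon_0$ is the permittivity of free space ($\epsilon_0 = 8.85 \times 10^{-12}\ \mathrm{C^2/(N \cdot m^2)}$) and $\boldsymbol{\mathscr{r}} = \mathbf{r} - \mathbf{r}'$ (Griffiths' script-r).
In electrostatics, force increases with higher-magnitude charges, and decreases with the square of the distance.
If we have several charges, we just sum their individual Coulomb forces as calculated above:
$$\mathbf{F} = \frac{Q}{4\pi\epsilon_0} \left( \frac{q_1}{\mathscr{r}_1^2} \hat{\mathscr{r}}_1 + \frac{q_2}{\mathscr{r}_2^2} \hat{\mathscr{r}}_2 + \ldots \right)$$
In shorthand,
$$\mathbf{F} = Q \mathbf{E}$$
where $\mathbf{E}$, the electric field, is
$$\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\epsilon_0} \sum_{i=1}^{n} \frac{q_i}{\mathscr{r}_i^2} \hat{\mathscr{r}}_i$$
where $\boldsymbol{\mathscr{r}}_i = \mathbf{r} - \mathbf{r}_i'$.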
The electric field is the force per unit charge - this force can only be imparted by the presence of other charges.
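Superposition translates almost word-for-word into code: loop over the source charges, add up each one's Coulomb field at the test point. A minimal sketch (the charge values and positions are invented for illustration, and `e_field` is just a name I picked):

```python
import math

K = 8.9875517923e9  # 1 / (4 pi eps0), in N m^2 / C^2


def e_field(charges, r):
    """Electric field at point r from a list of (q, (x, y, z)) source charges,
    summing each charge's Coulomb field (superposition)."""
    Ex = Ey = Ez = 0.0
    for q, (x, y, z) in charges:
        # Separation vector from this source charge to the field point
        sx, sy, sz = r[0] - x, r[1] - y, r[2] - z
        s = math.sqrt(sx * sx + sy * sy + sz * sz)
        factor = K * q / s**3   # q / s^2 times the unit vector (sx, sy, sz) / s
        Ex += factor * sx
        Ey += factor * sy
        Ez += factor * sz
    return (Ex, Ey, Ez)


# Two equal 1 nC charges on the x-axis, one meter either side of the origin
charges = [(1e-9, (-1.0, 0.0, 0.0)), (1e-9, (1.0, 0.0, 0.0))]

E_above = e_field(charges, (0.0, 1.0, 0.0))   # on the perpendicular bisector
E_middle = e_field(charges, (0.0, 0.0, 0.0))  # exactly between the charges
print(E_above, E_middle)
```

On the bisector the $x$-components cancel and the $y$-components add; at the midpoint everything cancels. The force on a test charge $Q$ there would just be $Q$ times these vectors.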
Continuous Charge Distributions
If we have a set of charges $q_1, q_2, \ldots, q_n$, then our electric field is generated from a discrete charge distribution. If, however, the charge is distributed continuously over some region, then we have a continuous charge distribution generating the $\mathbf{E}$-field.
The electric field is calculated as an integral over that continuous distribution:
$$\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\epsilon_0} \int \frac{1}{\mathscr{r}^2} \hat{\mathscr{r}}\, dq$$
where $dq$ varies for our continuous distribution type (1D, 2D or 3D), such that:
- Line: $dq = \lambda\, dl'$
- Area: $dq = \sigma\, da'$
- Volume: $dq = \rho\, d\tau'$
Divergence and Curl of Electrostatic Fields
Griffiths 2.2.
Field lines are a way to visualize electric fields on a 2D or 3D medium.
The density of field lines indicates the strength of the E-field at that location (strongest near the charge). The flux of $\mathbf{E}$ through some surface $S$ is
$$\Phi_E = \int_S \mathbf{E} \cdot d\mathbf{a}$$
and, for a closed surface, is a way to measure the total charge contained within, while charges outside of the surface don't affect it.
This is mathematically represented by Gauss's law, which states
$$\oint \mathbf{E} \cdot d\mathbf{a} = \frac{Q_{enc}}{\epsilon_0}$$
There's also a differential version of Gauss's law (found by applying the divergence theorem) where $\rho$ is charge density:
$$\nabla \cdot \mathbf{E} = \frac{\rho}{\epsilon_0}$$
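For example: in 3D, if a point charge $q$ is at the origin then the flux of $\mathbf{E}$ through a sphere of radius $r$ around it is
$$\oint \mathbf{E} \cdot d\mathbf{a} = \frac{1}{4\pi\epsilon_0} \frac{q}{r^2} (4\pi r^2) = \frac{q}{\epsilon_0}$$
We're only evaluating at the surface of the sphere, so $r$ (and hence $|\mathbf{E}|$) is constant.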
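Gauss's law still holds when the charge is off-center (the field on the sphere is no longer constant, so the integral is just harder to do by hand). Here's a numerical sketch of that: the flux $\oint \mathbf{E} \cdot d\mathbf{a}$ through a sphere for a point charge sitting on the $z$-axis at $z = a$. The setup and function name are my own, and the $\phi$ integral is replaced by a factor of $2\pi$ thanks to the symmetry about the $z$-axis.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, C^2 / (N m^2)


def flux_through_sphere(q, a, R, n_theta=400):
    """Flux of E through a sphere of radius R centered at the origin,
    for a point charge q located at (0, 0, a)."""
    k = 1.0 / (4 * math.pi * EPS0)
    dtheta = math.pi / n_theta
    total = 0.0
    for j in range(n_theta):
        th = (j + 0.5) * dtheta
        # Point on the sphere (taken in the x-z plane, by azimuthal symmetry)
        x, z = R * math.sin(th), R * math.cos(th)
        # Separation vector from the charge to that point
        sx, sz = x, z - a
        s = math.sqrt(sx * sx + sz * sz)
        # Component of E along the outward normal (sin th, 0, cos th)
        E_n = k * q * (sx * math.sin(th) + sz * math.cos(th)) / s**3
        total += E_n * R * R * math.sin(th) * dtheta
    return 2 * math.pi * total


q_point = 1e-9  # C, illustrative
flux_inside = flux_through_sphere(q_point, a=0.5, R=1.0)   # charge inside the sphere
flux_outside = flux_through_sphere(q_point, a=2.0, R=1.0)  # charge outside the sphere
print(flux_inside, flux_outside, q_point / EPS0)
```

The enclosed case should reproduce $q / \epsilon_0$, and the external charge should give zero net flux, since what enters one side leaves the other.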
Note that Gauss's law is always true, but it's only useful for finding $\mathbf{E}$ when the E-field points in the same direction as the area elements $d\mathbf{a}$ and has constant magnitude over the Gaussian surface. This requires both a symmetrical charge distribution within the object, and for the Gaussian surface to be symmetric according to one of the following:
- Spherically symmetric (concentric spheres)
- Cylindrically symmetric (coaxial cylinders)
- Plane symmetric (like a "box" that is cut in half by the plane)
The associated coordinate system is used for each.
Note: flux from external charges is ignored, since the flux entering a Gaussian surface from an external charge will be equal to the flux leaving that surface from the other side - hence, net flux from those external charges is zero.
If the object we're evaluating doesn't have perpendicular surface elements (i.e. a charge at the north pole of a sphere instead of the center), then some problems can be approached by creating another Gaussian element centered over the charge which subtends the same angle, though this can get geometrically complicated.
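Divergence of $\mathbf{E}$
Taking the divergence of the field of a continuous distribution,
$$\nabla \cdot \mathbf{E} = \frac{1}{4\pi\epsilon_0} \int \nabla \cdot \left( \frac{\hat{\mathscr{r}}}{\mathscr{r}^2} \right) \rho(\mathbf{r}')\, d\tau'$$
Since $\nabla \cdot (\hat{\mathscr{r}} / \mathscr{r}^2) = 4\pi\, \delta^3(\boldsymbol{\mathscr{r}})$, then this just becomes
$$\nabla \cdot \mathbf{E} = \frac{1}{4\pi\epsilon_0} \int 4\pi\, \delta^3(\mathbf{r} - \mathbf{r}')\, \rho(\mathbf{r}')\, d\tau' = \frac{\rho(\mathbf{r})}{\epsilon_0}$$
Or, in shortform,
$$\nabla \cdot \mathbf{E} = \frac{\rho}{\epsilon_0}$$
Curl of $\mathbf{E}$
For any electrostatic charge distribution, the curl of the field is always zero: $\nabla \times \mathbf{E} = 0$.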
Electric Potential
Griffiths 2.3.
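Electric potential $V$ measures the work per unit charge needed to move a test charge through an electric field from a reference point $\mathcal{O}$ to the point $\mathbf{r}$, and is defined as
$$V(\mathbf{r}) = -\int_{\mathcal{O}}^{\mathbf{r}} \mathbf{E} \cdot d\mathbf{l}$$
with units $\mathrm{J/C} = \mathrm{V}$.
A "natural" reference point is a point infinitely far from the charge, though only when the charge distribution itself doesn't also extend to infinity.
In differential form, it's the reverse:
$$\mathbf{E} = -\nabla V$$
Voltage is path-independent and only the endpoints matter - hence the potential difference between points $\mathbf{a}$ and $\mathbf{b}$ is
$$V(\mathbf{b}) - V(\mathbf{a}) = -\int_{\mathbf{a}}^{\mathbf{b}} \mathbf{E} \cdot d\mathbf{l}$$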
Poisson & Laplace Equations
Poisson's equation says
$$\nabla^2 V = -\frac{\rho}{\epsilon_0}$$
If there is no charge ($\rho = 0$), Poisson's equation turns into Laplace's equation:
$$\nabla^2 V = 0$$
Voltage equations
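For a point charge $q$,
$$V(\mathbf{r}) = \frac{1}{4\pi\epsilon_0} \frac{q}{\mathscr{r}}$$
$\mathbf{r}$ is the field point (measured from the origin), and $\mathscr{r}$ is the distance from the point charge to the field point $\mathbf{r}$.
For a set of point charges, we can use the principle of superposition:
$$V(\mathbf{r}) = \frac{1}{4\pi\epsilon_0} \sum_{i=1}^{n} \frac{q_i}{\mathscr{r}_i}$$
For a continuous distribution,
$$V(\mathbf{r}) = \frac{1}{4\pi\epsilon_0} \int \frac{1}{\mathscr{r}}\, dq$$
and for a volume (to compute $V$ when we know $\rho$):
$$V(\mathbf{r}) = \frac{1}{4\pi\epsilon_0} \int \frac{\rho(\mathbf{r}')}{\mathscr{r}}\, d\tau'$$
with $\mathbf{r}'$ from origin to charge.
Note: this is similar to the formula for electric field, but is missing $\hat{\mathscr{r}}$ (and has $\mathscr{r}$ rather than $\mathscr{r}^2$) - $V$ is a scalar quantity, directionless.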
Summary
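The three "fundamentals" of electrostatics are $\rho$, $\mathbf{E}$ and $V$, and are related by the following:
- $\rho$ and $\mathbf{E}$: $\nabla \cdot \mathbf{E} = \rho / \epsilon_0$ and $\nabla \times \mathbf{E} = 0$, or $\mathbf{E} = \frac{1}{4\pi\epsilon_0} \int \frac{\hat{\mathscr{r}}}{\mathscr{r}^2} \rho\, d\tau'$
- $\mathbf{E}$ and $V$: $\mathbf{E} = -\nabla V$, or $V = -\int \mathbf{E} \cdot d\mathbf{l}$
- $\rho$ and $V$: $\nabla^2 V = -\rho / \epsilon_0$, or $V = \frac{1}{4\pi\epsilon_0} \int \frac{\rho}{\mathscr{r}}\, d\tau'$
Griffiths Fig. 2.35.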
Work and Energy
Griffiths 2.4.
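For a conservative force, work is path-independent. In electrostatics, the work we do moving a test charge $Q$ from $\mathbf{a}$ to $\mathbf{b}$ (pushing against the electric force $Q\mathbf{E}$) is
$$W = \int_{\mathbf{a}}^{\mathbf{b}} \mathbf{F} \cdot d\mathbf{l} = -Q \int_{\mathbf{a}}^{\mathbf{b}} \mathbf{E} \cdot d\mathbf{l}$$
More extended,
$$W = Q\, [V(\mathbf{b}) - V(\mathbf{a})]$$
so the potential difference $V(\mathbf{b}) - V(\mathbf{a}) = W / Q$ is the work it takes, per unit charge, to carry a charge from $\mathbf{a}$ to $\mathbf{b}$. If our reference point is at infinity and we let $V(\infty) = 0$, then
$$W = Q\, V(\mathbf{r})$$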
Work in a Point Charge Distribution
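Imagine we had some charge $q_1$ - then brought in a new charge $q_2$ to join $q_1$. Our work done would be
$$W_2 = q_2 V_1(\mathbf{r}_2)$$
Explicitly, it would cost
$$W_2 = \frac{1}{4\pi\epsilon_0} q_2 \left( \frac{q_1}{\mathscr{r}_{12}} \right)$$
Then, to bring in a third charge $q_3$ we would act against the superposition of $q_1$ and $q_2$:
$$W_3 = \frac{1}{4\pi\epsilon_0} q_3 \left( \frac{q_1}{\mathscr{r}_{13}} + \frac{q_2}{\mathscr{r}_{23}} \right)$$
... etc. To assemble $n$ charges, we'd need a work of
$$W = \frac{1}{2} \sum_{i=1}^{n} q_i V(\mathbf{r}_i)$$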
Work in a Continuous Charge Distribution
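For a continuous charge distribution, we use our last equation for our point charge distribution system and replace the sum with an integral ($q_i \to \rho\, d\tau$):
$$W = \frac{1}{2} \int \rho V\, d\tau$$
With a bit of mathematics, this is equivalent to
$$W = \frac{\epsilon_0}{2} \int_{\text{all space}} E^2\, d\tau$$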
Conductors
Griffiths 2.5.
Conductors are materials in which electrons are free to roam (like cows, on an open ranch 👍). In contrast, insulators have electrons pretty much immobile and packed-together (like Monsanto farms 👎).
We can approximate metals as ideal-case conductors (perfect conductors don't yet exist, though we're gradually coming closer) with the following attributes:
- $\mathbf{E} = 0$ inside a conductor - or, the induced field will cancel the external field. If an external field $\mathbf{E}_0$ is applied to a conductor, free electrons will move against the field direction until they pile up on one surface, creating a deficit of electrons on the opposite surface and positively charging it. Crucially, the net E-field is zero, so the fields must cancel inside: $\mathbf{E}_{induced} = -\mathbf{E}_0$. Charge will continue to flow until the cancellation is complete. Outside the conductor, $\mathbf{E} \neq 0$ in general, since the two fields don't tend to cancel.
- Charge density is zero ($\rho = 0$) inside: Since Gauss's law says $\nabla \cdot \mathbf{E} = \rho / \epsilon_0$, zero E-field means zero charge density in the conductor.
- Net charge resides on surface: positive and negative charges will only sit on the surface after enough time passes.
- $\mathbf{E}$ is perpendicular to the surface just outside the conductor: somewhat obvious, but bears mentioning - any tangential component would push charge along the surface until it vanished.
Any dynamical system will try to minimize potential energy - the charges residing on the surface are an extension of this. It might take some time, but will eventually happen.
Induced Charges
If we hold a charge $+q$ near an uncharged conductor, the conductor will move toward the charge - this is because induced negative charges will accumulate closer to $q$ than the induced positive charges on the far side. Force falls off as $1/r^2$, so the attraction of the near side wins and the conductor will be attracted to the charge.
Cavities
Let's say we have a cavity inside our conductive surface. There are two scenarios here:
Empty cavity: If the cavity has no charge, the field within the cavity is zero, regardless of the external fields applied. This is the principle of the Faraday cage.
Non-empty cavity: The charge $q$ contained by the cavity will induce an opposite total charge $-q$ on the walls of the cavity (uniformly distributed only in symmetric cases, like a charge at the center of a spherical cavity) - the only information transmitted to an external observer is the amount of net charge contained, which shows up as charge on the exterior wall.
Surface Charge and Force on a Conductor
Though the field inside a conductor is zero, the field immediately outside is
$$\mathbf{E} = \frac{\sigma}{\epsilon_0} \hat{\mathbf{n}}$$
If we only care about the magnitude of the E-field, then $|\mathbf{E}| = |\sigma| / \epsilon_0$.
However, the electric field is discontinuous at a surface charge, so when calculating the force per unit area (or pressure) of an E-field at the surface of a conductor, we average the E-fields above and below the surface, such that
$$\mathbf{f} = \sigma \mathbf{E}_{avg} = \frac{1}{2} \sigma (\mathbf{E}_{above} + \mathbf{E}_{below}) = \frac{1}{2\epsilon_0} \sigma^2 \hat{\mathbf{n}}$$
This is the outward electrostatic pressure on the surface, tending to draw the conductor into a given field, regardless of the sign of $\sigma$ (it gets squared away).
The pressure at this point, expressed in terms of the field just outside the surface, is
$$P = \frac{\epsilon_0}{2} E^2$$
Capacitors
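Put two conductors beside one another (here, parallel plates of area $A$ a distance $d$ apart), with equal and opposite uniform net charges: $+Q$ on one and $-Q$ on the other:
Since the charge density is uniform on each surface, $\sigma = Q / A$ (on the left plate) and the field between the two is
$$E = \frac{\sigma}{\epsilon_0} = \frac{Q}{A \epsilon_0}$$
with a voltage of
$$V = Ed = \frac{Qd}{A \epsilon_0}$$
in a uniform E-field.
We'll define a new term to represent the proportionality between $Q$ and $V$ for the arrangement, capacitance:
$$C \equiv \frac{Q}{V}$$
with units of farads ($1\ \mathrm{F} = 1\ \mathrm{C/V}$), usually expressed in $\mu\mathrm{F}$ or $\mathrm{pF}$. The work to charge the capacitor up to a final voltage $V$ is
$$W = \frac{1}{2} C V^2$$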
Chapter 3 - Potentials
Reference "Introduction to Electrodynamics" by David Griffiths.
The goal of electrostatics is to find the electric field given some stationary, immobile charge distribution.
Coulomb's Law is the way we do this for simple charge configurations, but for more complex charge configurations it's often easier to work with potential $V$. For areas of nonzero charge density such as point, surface or volume charges, we use Poisson's equation:
$$\nabla^2 V = -\frac{\rho}{\epsilon_0}$$
Outside of these charge regions (such as in regular space), this reduces to Laplace's equation:
$$\nabla^2 V = 0$$
with solutions called harmonic functions.
Laplace's equation is fundamental to the study of electrostatics according to Griffiths.
Laplace's Equation
In three dimensions, visualization is challenging - but the same two properties apply as in one and two dimensions, with the first this time being an average over a sphere:
- The value of $V$ at a point $\mathbf{r}$ is the average over a spherical surface of radius $R$ centered at that point, with $V(\mathbf{r}) = \frac{1}{4\pi R^2} \oint_{sphere} V\, da$
- $V$ can have no local maxima or minima, with extreme values only permitted at the boundaries (i.e. surface of the sphere).
Point 1 is particularly relevant in each of these circumstances - the value of $V$ at some point is the average of the surrounding $V$ on some surrounding boundary.
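The averaging property is easy to test numerically. A sketch (my own setup, in units where $q / 4\pi\epsilon_0 = 1$): put a point charge on the $z$-axis at distance $a$ from the origin, average its potential $1/\mathscr{r}$ over a sphere of radius $R < a$ centered at the origin, and compare against the potential at the center, $1/a$. The sphere must not contain the charge, since $V$ only obeys Laplace's equation in charge-free regions.

```python
import math


def average_potential_on_sphere(a, R, n_theta=2000):
    """Average of 1 / |r - a z_hat| over a sphere of radius R centered at the origin.
    By symmetry about z the phi integral is trivial, leaving
    (1/2) * integral of V(theta) sin(theta) dtheta."""
    dtheta = math.pi / n_theta
    total = 0.0
    for j in range(n_theta):
        th = (j + 0.5) * dtheta
        # Law of cosines: distance from the charge to a point on the sphere
        dist = math.sqrt(R * R + a * a - 2 * a * R * math.cos(th))
        total += (1.0 / dist) * math.sin(th) * dtheta
    return 0.5 * total


a_charge = 3.0   # distance of the charge from the sphere's center (assumed)
R_sphere = 1.0   # sphere radius, smaller than a_charge so the charge is outside

v_average = average_potential_on_sphere(a_charge, R_sphere)
v_center = 1.0 / a_charge
print(v_average, v_center)
```

The two numbers agree no matter what radius is picked, as long as the charge stays outside the sphere.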
Uniqueness Theorems
The proof that a proposed set of boundary conditions will suffice takes the form of a uniqueness theorem (alternatively, a criterion to determine whether a solution to the Laplace or Poisson equations is unique).
Uniqueness Theorem 1: the solution to Laplace's equation in a volume $\mathcal{V}$ is unique if $V$ is specified on a boundary surface $\mathcal{S}$ enclosing the volume $\mathcal{V}$.
The surface does not have to be Gaussian - it can look totally eldritch and crazy and the theorem would hold.
Uniqueness Theorem 2: given a volume $\mathcal{V}$ surrounded by conductors and containing some charge density $\rho$, the electric field is uniquely determined if the total charge on each conductor is given.
Method of Images
Griffiths 3.2.
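Say we have some charge $q$ a distance $d$ above an infinite grounded conducting plane; what is the potential in the region above the plane? Our boundary conditions state $V = 0$ at $z = 0$ (the plane is grounded), and $V \to 0$ far from the charge ($x^2 + y^2 + z^2 \gg d^2$).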
We could solve Poisson's equation for this region, but a much easier technique is to use the Method of Images. Wikipedia says on the subject:
The method of image charges is used in electrostatics to simply calculate or visualize the distribution of the electric field of a charge in the vicinity of a conducting surface.
It is based on the fact that the tangential component of the electrical field on the surface of a conductor is zero, and that an electric field E in some region is uniquely defined by its normal component over the surface that confines this region (the uniqueness theorem).
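Start by removing the conductor, and placing an opposite charge $-q$ at $z = -d$:
Then, the potential is easy to calculate:
$$V(x, y, z) = \frac{1}{4\pi\epsilon_0} \left[ \frac{q}{\sqrt{x^2 + y^2 + (z - d)^2}} - \frac{q}{\sqrt{x^2 + y^2 + (z + d)^2}} \right]$$
which obeys both of the boundary conditions of the original problem:
- $V = 0$ when $z = 0$
- $V \to 0$ for $x^2 + y^2 + z^2 \gg d^2$
Thus by uniqueness theorem 1, $V$ is the solution to our original problem in the region $z \geq 0$.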
Uniqueness theorem 1 means that if a solution satisfies Poisson's equation in the region of interest and assumes the correct value at the boundaries, it must be right.
The Method of Images can be used in any scenario where we have a stationary charge distribution near a grounded conducting plane.
Induced Surface Charges
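The surface charge density induced on a conductor is
$$\sigma = -\epsilon_0 \frac{\partial V}{\partial n}$$
where $\partial V / \partial n$ is the normal derivative of $V$ at the surface. In the above case, the normal is in the $z$ direction - if we take this partial derivative of our above calculated voltage and evaluate it at $z = 0$, then
$$\sigma(x, y) = \frac{-qd}{2\pi (x^2 + y^2 + d^2)^{3/2}}$$
As expected from a positive charge, the induced surface charge is negative and greatest in magnitude at $x = y = 0$.
The total induced charge is
$$Q = \int \sigma\, da = -q$$
Yes!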
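Here's a numerical version of that last integral, as a sanity check. For a charge $q$ at height $d$ above the grounded plane, the induced density is $\sigma(r) = -qd / \left(2\pi (r^2 + d^2)^{3/2}\right)$ with $r^2 = x^2 + y^2$, and integrating in rings of area $2\pi r\, dr$ should approach $-q$. This is a rough sketch with made-up values; since the numerical integral has to stop at some finite radius, I also compare against the exact partial result $-q \left(1 - d / \sqrt{r_{max}^2 + d^2}\right)$.

```python
import math


def sigma(r, q, d):
    """Induced surface charge density a distance r from the point directly under the charge."""
    return -q * d / (2 * math.pi * (r * r + d * d) ** 1.5)


def induced_charge(q, d, r_max, n=200000):
    """Midpoint-rule integral of sigma over the disk r < r_max, using rings of area 2 pi r dr."""
    dr = r_max / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        total += sigma(r, q, d) * 2 * math.pi * r * dr
    return total


q_real = 1.0     # charge above the plane (arbitrary units, assumed)
d_height = 1.0   # height above the plane
r_cut = 1000.0   # radius where the numerical integral is cut off

Q_numeric = induced_charge(q_real, d_height, r_cut)
Q_partial_exact = -q_real * (1 - d_height / math.sqrt(r_cut**2 + d_height**2))
print(Q_numeric, Q_partial_exact)
```

With the cutoff at $1000\, d$, the result is within about 0.1% of $-q$; the missing bit is the charge spread thinly over the rest of the infinite plane.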
Force & Energy
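Since the induced charge on the conductor is $-q$ and our charge is $+q$, it is attracted to the plane with a force given by Coulomb's Law (using the image charge, a distance $2d$ away):
$$\mathbf{F} = -\frac{1}{4\pi\epsilon_0} \frac{q^2}{(2d)^2} \hat{\mathbf{z}}$$
While force is the same in our mirror problem, energy is not. For two point charges, $W = \frac{1}{4\pi\epsilon_0} \frac{q_1 q_2}{\mathscr{r}}$, such that
$$W = -\frac{1}{4\pi\epsilon_0} \frac{q^2}{2d}$$
But for a single charge and conducting plane (continuous charge distribution), energy is half of this:
$$W = -\frac{1}{4\pi\epsilon_0} \frac{q^2}{4d}$$
Bringing two point charges towards one another means doing work on both of them, while bringing a point charge toward a grounded conductor means doing work on only one charge (the induced charge moves over an equipotential, which costs nothing) - so only half the work is necessary.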
Separation of Variables
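Griffiths 3.3
Separation of variables is a way to solve ODEs and PDEs by rewriting equations such that each of the two variables occurs on a different side of the equation.
Separable equations must be able to be written in the form
$$\frac{dy}{dx} = f(x)\, g(y)$$
We can rearrange the terms to get
$$\frac{dy}{g(y)} = f(x)\, dx$$
integrate, and add some constant term to one side to represent all our constants of integration.
In the context of electrostatics, separation of variables is very useful when solving 2D Laplace equations, such as
$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} = 0$$
We need solutions in the form of
$$V(x, y) = X(x)\, Y(y)$$
This can be accomplished through some mathematical trickery (substitute, then divide through by $XY$) to find our separated variables ...
$$\frac{1}{X} \frac{d^2 X}{dx^2} + \frac{1}{Y} \frac{d^2 Y}{dy^2} = 0$$
... which is of the form $f(x) + g(y) = 0$. Thus, both $f$ and $g$ must be constant (we can't hold one constant and change the other with this solution still holding).
So,
$$\frac{1}{X} \frac{d^2 X}{dx^2} = C_1, \qquad \frac{1}{Y} \frac{d^2 Y}{dy^2} = C_2, \qquad C_1 + C_2 = 0$$
Converting each equation into an ODE (taking $C_1 = k^2$, so $C_2 = -k^2$; which one gets the positive sign depends on the boundary conditions),
$$\frac{d^2 X}{dx^2} = k^2 X, \qquad \frac{d^2 Y}{dy^2} = -k^2 Y$$
... we converted a PDE into two ODEs, which are much easier to solve. Our solutions will be a constant coefficient set:
$$X(x) = A e^{kx} + B e^{-kx}, \qquad Y(y) = C \sin ky + D \cos ky$$
We can find our constants based on our boundary conditions now.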
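A quick way to check a separated solution is to plug it back into Laplace's equation numerically. The sketch below takes a product of the form $V(x, y) = (A e^{kx} + B e^{-kx})(C \sin ky + D \cos ky)$, with constants I picked arbitrarily, and evaluates $\partial^2 V / \partial x^2 + \partial^2 V / \partial y^2$ with a standard five-point finite-difference stencil. It should come out (nearly) zero everywhere, even though each second derivative on its own is not small.

```python
import math


def V(x, y, k=2.0, A=1.0, B=0.5, C=1.0, D=-0.3):
    """Separated solution X(x) * Y(y) with arbitrary (assumed) constants."""
    X = A * math.exp(k * x) + B * math.exp(-k * x)
    Y = C * math.sin(k * y) + D * math.cos(k * y)
    return X * Y


def second_derivative_x(f, x, y, h=1e-3):
    """Central finite-difference estimate of d^2 f / dx^2."""
    return (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / (h * h)


def laplacian_fd(f, x, y, h=1e-3):
    """Five-point finite-difference estimate of the 2D Laplacian."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4 * f(x, y)) / (h * h)


x0, y0 = 0.3, 0.7  # arbitrary test point
lap = laplacian_fd(V, x0, y0)
d2x = second_derivative_x(V, x0, y0)
print("d2V/dx2 =", d2x)
print("Laplacian =", lap)
```

The $x$ and $y$ curvatures are equal and opposite ($+k^2 V$ and $-k^2 V$), which is exactly the $C_1 + C_2 = 0$ condition.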
Multipole Expansion
Griffiths 3.4
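Say you have some charge distribution that you can see in space, far away. It is almost like an exoplanet in a sense: we know it exists, we know its total mass, and maybe we know stuff like how far away it is.
But, we don't know what the surface looks like; what it's composed of, be it rock or ice or water. Just its net mass.
Similarly, our charge distribution might be crazy complicated, and its net charge only describes a tiny part of the story, albeit an important one - the multipole expansion is used to describe this fuller story, and is defined in terms of voltage as a series in powers of $1/r$:
$$V(\mathbf{r}) = V_{mon}(\mathbf{r}) + V_{dip}(\mathbf{r}) + V_{quad}(\mathbf{r}) + \ldots$$
Mathematically, it is
$$V(\mathbf{r}) = \frac{1}{4\pi\epsilon_0} \sum_{n=0}^{\infty} \frac{1}{r^{n+1}} \int (r')^n P_n(\cos\alpha)\, \rho(\mathbf{r}')\, d\tau'$$
where $\alpha$ is the angle between $\mathbf{r}$ and $\mathbf{r}'$, $P_n$ are the Legendre polynomials, $\mathbf{r}$ is the reference point (from origin) and $\mathbf{r}'$ the charge (from origin).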
Monopole and dipole terms
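At large $r$, a charge distribution just looks like a net charge (like an exoplanet). Thus, the monopole moment is just the net charge:
$$Q = \int \rho\, d\tau, \qquad V_{mon}(\mathbf{r}) = \frac{1}{4\pi\epsilon_0} \frac{Q}{r}$$
The dipole moment describes, to first order, how the charges are distributed:
$$\mathbf{p} = \int \mathbf{r}' \rho(\mathbf{r}')\, d\tau', \qquad V_{dip}(\mathbf{r}) = \frac{1}{4\pi\epsilon_0} \frac{\mathbf{p} \cdot \hat{\mathbf{r}}}{r^2}$$
$\mathbf{r}'$ points from the origin to some element of the charge distribution. $\mathbf{r}$ is from the origin to some reference point.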
Origin of coordinates
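Changing the origin will never change the monopole moment $Q$, but will change the dipole moment $\mathbf{p}$ as long as the total charge $Q \neq 0$: shifting the origin by $\mathbf{a}$ gives $\bar{\mathbf{p}} = \mathbf{p} - Q\mathbf{a}$.
For example: if our system has $+q$ and $-q$ as its point distribution, $Q = 0$ and the dipole moment is origin-independent.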
Electric field of moments: the electric field is defined as $\mathbf{E} = -\nabla V$.
To find the electric field caused by, say, the dipole moment, find the voltage term for that dipole moment, then take the negative gradient of it.
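To see how good the dipole term is on its own, here's a sketch comparing the exact potential of a physical dipole ($+q$ at $z = +d/2$, $-q$ at $z = -d/2$, so $Q = 0$ and $p = qd$) against the dipole term $V_{dip} = \frac{1}{4\pi\epsilon_0} \frac{p \cos\theta}{r^2}$. The charge, separation and sample points are my own illustrative choices.

```python
import math

K = 8.9875517923e9  # 1 / (4 pi eps0), in N m^2 / C^2


def v_exact(q, d, r, theta):
    """Exact potential of +q at z = +d/2 and -q at z = -d/2, at (r, theta)."""
    x, z = r * math.sin(theta), r * math.cos(theta)
    r_plus = math.hypot(x, z - d / 2)    # distance to +q
    r_minus = math.hypot(x, z + d / 2)   # distance to -q
    return K * q * (1 / r_plus - 1 / r_minus)


def v_dipole(q, d, r, theta):
    """Dipole term of the multipole expansion, with p = q d along z."""
    p = q * d
    return K * p * math.cos(theta) / r**2


def relative_error(q, d, r, theta):
    exact = v_exact(q, d, r, theta)
    return abs(exact - v_dipole(q, d, r, theta)) / abs(exact)


q_dip, d_dip = 1e-9, 0.01      # 1 nC charges, 1 cm apart (assumed)
theta_obs = math.pi / 4

err_near = relative_error(q_dip, d_dip, 2 * d_dip, theta_obs)    # 2 separations away
err_far = relative_error(q_dip, d_dip, 100 * d_dip, theta_obs)   # 100 separations away
print(err_near, err_far)
```

Close in, the higher-order terms still matter at the percent level; far away, the dipole term is essentially the whole story. The field would then follow from $\mathbf{E} = -\nabla V_{dip}$.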
Chapter 1 - Stern-Gerlach Experiments
Reference Quantum Mechanics: A Paradigms Approach by David McIntyre.
First conceived by Otto Stern in 1921 and carried out with Walther Gerlach in 1922, the SG experiment involved sending a beam of silver atoms through a nonuniform magnetic field and observing their distribution.
Since silver atoms are electrically neutral, there's no net Lorentz force on them; any deflection must come from a magnetic moment $\boldsymbol{\mu}$ interacting with the field gradient. Classically, if the atoms carry magnetic moments at all, those moments point in random directions, so the beam ought to smear out into a continuous band (or just pass straight through if $\mu = 0$).
Experimentally, we see the silver beam split at the magnetic field into two roughly equal-sized groups, indicating not only that the magnetic field interacts with the silver atoms, but that each electrically-neutral silver atom must have some additional property that interacts with a $\mathbf{B}$-field (i.e. that the $\mathbf{B}$-field puts some force on the silver atoms based on some binary property).
We know $\mathbf{F} = \nabla (\boldsymbol{\mu} \cdot \mathbf{B})$, so $F_z \approx \mu_z\, \partial B_z / \partial z$ (ref: Wikipedia entry on magnetic fields), where $\boldsymbol{\mu}$ is the magnetic moment and $\mathbf{B}$ is the magnetic field strength. Classically, $\mu_z$ could take any value between $-\mu$ and $+\mu$ (or be zero, with all atoms hitting the center) - however, we see only two distinct deflections. Further, we have an "upper" and "lower" group with equal distances from the center - so $|F_{up}| = |F_{down}|$. With $\partial B_z / \partial z$ known to be constant, then by the results of the experiment we must have two values for $\mu_z$: $+\mu$ and $-\mu$ (the magnetic moment values are quantized).
Enter: Electron Spin
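If we imagine (classically) an electron as a charge $q$ moving around a circular loop of radius $r$ at speed $v$, forming a loop of current,
then $\mu = IA$. Here, $I = q / T = qv / (2\pi r)$ and $A = \pi r^2$, so
$$\mu = \frac{qv}{2\pi r} \pi r^2 = \frac{qrv}{2}$$
Thus, (classically), we can imagine each charge as having some "orbital angular momentum" (like planets around a star). Angular momentum is $L = mrv$, so we can rewrite this as
$$\boldsymbol{\mu} = \frac{q}{2m} \mathbf{L}$$
Experimentally, we observe that, while normal orbital angular momentum of charged particles still "exists", it's not the whole picture - we also need "spin" $\mathbf{S}$, where we can write $\boldsymbol{\mu}$ as
$$\boldsymbol{\mu} = g \frac{q}{2m} \mathbf{S}$$
where $g$ is a dimensionless "gyroscopic ratio" (note: different from planet spin since the electron is, as far as we know, a point particle). For an electron ($q = -e$) in the $z$ direction,
$$\mu_z = -g \frac{e}{2 m_e} S_z$$
and, to achieve the results we observe in the experiment, we can have only two values of $S_z$:
$$S_z = \pm \frac{\hbar}{2}$$
where $\hbar$ is a modified Planck's constant, $\hbar = h / 2\pi$. This two-valued result represents a spin-1/2 system; systems with other spins have a different number of possible values.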
Quantum States
Postulate 1: the state of a quantum mechanical system includes all information we can know about it. Mathematically, we represent this state by a "ket", $|\psi\rangle$.
In the spin-1/2 system, the two output beams correspond to the kets $|+\rangle$ and $|-\rangle$,
where $|+\rangle$ is the quantum state of the atoms that are spin-up ($S_z = +\hbar/2$), and $|-\rangle$ is the quantum state of the spin-down atoms ($S_z = -\hbar/2$).
A few experiments that chain Stern-Gerlach analyzers on this basic setup give a better understanding of the quantum nature of spin, and explore this divergence from what is classically expected.
- Experiment 1: Analyzing spin in $z$ twice in a row. Spin values were conserved between measurements - if we only take $|+\rangle$ atoms and measure $S_z$ again, we only see $|+\rangle$ atoms.
- Experiment 2: Analyze spin in $z$, then in $x$ - the $S_x$ result was found to be totally independent of whether we used $|+\rangle$ or $|-\rangle$. Complicated, though - this is a mixed state, rather than the superposition of experiment 1. We'll investigate options to make it nicer later.
- Experiment 3: Analyze spin in $z$, then $x$, then $z$. We expected to see the $S_z$ spin values conserved - but don't! Instead, by measuring $S_x$, we find we "reset" the spin of $S_z$.
- Experiment 4: Analyze spin in $z$, then $x$, then $z$ - however, instead of measuring $S_x$, just send both outputs into the next $z$ analyzer. Somehow, by not "measuring" $S_x$, we don't reset the $S_z$ spin measurement.
Bra-ket notation
Used to represent quantum state vectors, which lie in a Hilbert space. The dimension of the Hilbert space is determined by the system at hand - in our above example, we have only two possible results, so $\{|+\rangle, |-\rangle\}$ represents a complete basis with dimensionality 2, and a general state is $|\psi\rangle = a|+\rangle + b|-\rangle$.
Note: $a$ and $b$ are complex scalar multiples. Some properties of the bra-ket notation (Dirac's first and only pun):
- Each ket has a corresponding bra, such that for some state $|\psi\rangle = a|+\rangle + b|-\rangle$, $\langle\psi| = a^*\langle+| + b^*\langle-|$.
- Multiplying a bra with a ket represents an inner (dot) product.
This means we can multiply $\langle+|$ with $|\psi\rangle$ to get each constant (i.e. $a$ here), such that $\langle+|\psi\rangle = a\langle+|+\rangle + b\langle+|-\rangle = a$.
Likewise, $\langle-|\psi\rangle = b$ and $\langle\psi|+\rangle = a^*$.
- All quantum state vectors must be normalized, such that $\langle\psi|\psi\rangle = 1$.
If we wanted to normalize some vector $|\psi\rangle$, apply some normalization constant $C$, such that $C^*C\langle\psi|\psi\rangle = 1$ and solve for $C$.
Note: this only fixes the absolute value $|C|$ - but we don't care about its phase (an overall phase is not physically meaningful), so just make $C$ real and positive.
- The complex constants $a$ and $b$, when absolute-squared (i.e. $|a|^2 = |\langle+|\psi\rangle|^2$), represent probabilities for a given measurement (the probability of $S_z = +\hbar/2$ above). The normalization property implies the probabilities must sum up to 1, consistent with postulate 1.
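To make the normalization and probability rules concrete, here's a minimal sketch in plain Python. The function names and the example state $|+\rangle + 2i|-\rangle$ are made up for illustration - they aren't from the textbook.

```python
import math

def normalize(a, b):
    """Scale the coefficients of a|+> + b|-> so that |a|^2 + |b|^2 = 1."""
    norm = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
    return a / norm, b / norm

def probabilities(a, b):
    """Return (P_up, P_down) for a normalized state a|+> + b|->."""
    return abs(a) ** 2, abs(b) ** 2

# Hypothetical unnormalized state: |psi> = |+> + 2i|->
a, b = normalize(1 + 0j, 2j)
p_up, p_down = probabilities(a, b)
print(p_up, p_down)
```

The phase of $b$ (the factor of $i$) doesn't change either probability - only the magnitudes matter here.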
Matrix form
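We can also represent states in matrix form, where $|+\rangle \doteq \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $|-\rangle \doteq \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, so that $|\psi\rangle = a|+\rangle + b|-\rangle \doteq \begin{pmatrix} a \\ b \end{pmatrix}$, with the corresponding bra represented by the row vector $\langle\psi| \doteq \begin{pmatrix} a^* & b^* \end{pmatrix}$, so $\langle\psi|\psi\rangle = \begin{pmatrix} a^* & b^* \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = |a|^2 + |b|^2$.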
General quantum systems
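For some general quantum system, where we might not only have 2 results (i.e. not only spin-1/2), such as $|\psi\rangle = \sum_i c_i|a_i\rangle$,
then, generally speaking, $\langle a_i|a_j\rangle = \delta_{ij}$ and $c_i = \langle a_i|\psi\rangle$.
Note: $c_i$ represents each corresponding complex scalar multiple, and $\delta_{ij}$ is the Kronecker delta, which is 1 if $i = j$ and 0 if $i \neq j$.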
Converting kets between axes
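We can represent up-down spin measurements in $x$ and $y$ in terms of our $z$-spin kets, such that $|\pm\rangle_x = \frac{1}{\sqrt{2}}\left(|+\rangle \pm |-\rangle\right)$ and $|\pm\rangle_y = \frac{1}{\sqrt{2}}\left(|+\rangle \pm i|-\rangle\right)$.
In matrix form, $|\pm\rangle_x \doteq \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ \pm 1 \end{pmatrix}$ and $|\pm\rangle_y \doteq \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ \pm i \end{pmatrix}$.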
The Postulates of Quantum Mechanics
Postulate 1: the state of a quantum mechanical system, $|\psi\rangle$, contains everything we can mathematically know about it - experimentally, the hidden variables theory (regardless of validity) is irrelevant (since the variables are by definition hidden).
Postulate 2: Physical observables (such as spin) are represented mathematically by some operator $A$ that operates on kets.
Postulate 3: The only results possible from a measurement are contained in the set of eigenvalues $a_n$ of the operator $A$.
Postulate 4: The probability of obtaining some eigenvalue $a_n$ in a measurement of the observable $A$ on the system in the state $|\psi\rangle$ is $\mathcal{P}_{a_n} = |\langle a_n|\psi\rangle|^2$, where $|a_n\rangle$ is the normalized eigenvector of $A$ corresponding to $a_n$.
Postulate 5: After a measurement of $A$ (such as spin) that yields some result $a_n$ (such as $+\hbar/2$), the quantum system is in a new state that is a normalized projection of the original system ket onto the ket (or kets) corresponding to the result of the measurement: $|\psi'\rangle = \frac{P_n|\psi\rangle}{\sqrt{\langle\psi|P_n|\psi\rangle}}$.
Postulate 6: The time evolution of a quantum system is determined by the Hamiltonian, or total energy operator $H(t)$, via the Schrödinger equation $i\hbar\frac{d}{dt}|\psi(t)\rangle = H(t)|\psi(t)\rangle$.
Note: only postulates 1 and 2 were covered in this chapter.
Chapter 2 - Operators & Measurement
Reference Quantum Mechanics: A Paradigms Approach by David McIntyre.
Operators are mathematical objects that operate on kets to turn them into new kets; $A|\psi\rangle = |\phi\rangle$ is one such example. If a ket is changed only by a multiplicative constant under the application of some operator, $A|\psi\rangle = a|\psi\rangle$, then the ket is known as an eigenvector and the associated constant is an eigenvalue. Above, $|\psi\rangle$ is the eigenvector and $a$ is an eigenvalue. Both eigenvectors and eigenvalues are properties of the operator - if it changes, so does the eigenstuff. For the operator $S_z$, the eigenvectors / values are: $S_z|+\rangle = +\frac{\hbar}{2}|+\rangle$ and $S_z|-\rangle = -\frac{\hbar}{2}|-\rangle$.
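Postulate 3: The only possible result of a measurement of an observable (like $S_z$) is one of the eigenvalues $a_n$ of the corresponding operator $A$.
This postulate implies that, if we have the eigenvectors and values of $S_z$, we can actually reconstruct $S_z$ from them.
Let $S_z \doteq \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Applying the eigenvalue equations $S_z|\pm\rangle = \pm\frac{\hbar}{2}|\pm\rangle$ with $|+\rangle \doteq \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $|-\rangle \doteq \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ leads us to $a = \frac{\hbar}{2}$, $b = c = 0$ and $d = -\frac{\hbar}{2}$. Now, plugging these solved values in, $S_z \doteq \frac{\hbar}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$.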
- Operators are diagonal in their own basis (i.e. they only have elements along the main diagonal).
- The elements along the diagonal are the eigenvalues of the operator.
- The eigenvectors themselves are not included in the operator matrix.
Note: an observable is some physical quantity that can be measured - such as spin. In that context, the operator represents that observable.
Other representations of matrix elements
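We can also represent an operator matrix in terms of its elements. Let the operator $A$ describe some two-dimensional spin-1/2 system (like $S_z$ does): $A \doteq \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$,
where $A_{11} = \langle+|A|+\rangle$, $A_{12} = \langle+|A|-\rangle$, $A_{21} = \langle-|A|+\rangle$ and $A_{22} = \langle-|A|-\rangle$.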
Finding eigenvectors / values from an observable
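If we know the operator $A$, and need to find the possible results of a measurement of the corresponding observable, we can start from the general eigenvalue equation, $A|a_n\rangle = a_n|a_n\rangle$, and work from there,
where $a_n$ is the eigenvalue and $|a_n\rangle$ is the corresponding eigenvector. Eigenvalues can be found then by solving the secular equation, $\det(A - \lambda I) = 0$. Note: for a $2 \times 2$ operator (i.e. a spin-1/2 system), this is $\begin{vmatrix} A_{11} - \lambda & A_{12} \\ A_{21} & A_{22} - \lambda \end{vmatrix} = (A_{11} - \lambda)(A_{22} - \lambda) - A_{12}A_{21} = 0$.
After finding the eigenvalues, we now know $A$ and $a_n$ - all that remains is to find $|a_n\rangle$ by solving for it in $A|a_n\rangle = a_n|a_n\rangle$.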
The process of finding the eigenvectors and eigenvalues of a matrix is known as diagonalization.
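As a quick numerical check of the diagonalization procedure, here's a minimal sketch using NumPy on the standard $S_x$ matrix written in the $S_z$ basis. Working in units where $\hbar = 1$ is an assumption made purely for convenience, as are the variable names.

```python
import numpy as np

HBAR = 1.0  # assumption: units where hbar = 1

# Standard S_x matrix in the S_z basis
S_x = (HBAR / 2) * np.array([[0, 1], [1, 0]], dtype=complex)

# eigh is NumPy's solver for Hermitian matrices; it returns the eigenvalues
# in ascending order and the eigenvectors as the columns of the second array
eigenvalues, eigenvectors = np.linalg.eigh(S_x)

for value, vector in zip(eigenvalues, eigenvectors.T):
    print(value, vector)
```

The two eigenvalues come out as $\pm\hbar/2$, and the eigenvector columns are (up to an overall phase) the $|\mp\rangle_x$ kets written in the $z$ basis.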
Hermitian operators
To find the bra associated with some operator acting on a ket, $|\phi\rangle = A|\psi\rangle$, we use $\langle\phi| = \langle\psi|A^\dagger$,
where $A^\dagger$ is the Hermitian adjoint of $A$, found by transposing and taking the complex conjugate of $A$. If $A = A^\dagger$, then $A$ is said to be Hermitian, and $\langle\phi| = \langle\psi|A$ for $|\phi\rangle = A|\psi\rangle$.
This is not generally the case - if $A$ is not Hermitian, the corresponding bra-state operator will be different from $A$ - i.e. $\langle\psi|A$ will not be the bra corresponding to $A|\psi\rangle$.
Hermitian operators have some properties to be aware of:
- They will always have real eigenvalues which ensures the results from a measurement are always real.
- The eigenvectors of a Hermitian operator form a complete set of basis states, ensuring we can use the eigenvectors of any observable as a valid basis.
New operators
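Let's say we wanted to find the spin component in some general direction $\hat{n}$,
where $\hat{n} = \hat{x}\sin\theta\cos\phi + \hat{y}\sin\theta\sin\phi + \hat{z}\cos\theta$. Projecting the spin vector onto $\hat{n}$ such that $S_n = \vec{S}\cdot\hat{n} = S_x\sin\theta\cos\phi + S_y\sin\theta\sin\phi + S_z\cos\theta$, then $S_n \doteq \frac{\hbar}{2}\begin{pmatrix} \cos\theta & \sin\theta\, e^{-i\phi} \\ \sin\theta\, e^{i\phi} & -\cos\theta \end{pmatrix}$ and, diagonalizing the matrix above, we find the eigenvalues to be $\pm\hbar/2$ and the eigenvectors for $S_n$ to be $|+\rangle_n = \cos\frac{\theta}{2}|+\rangle + \sin\frac{\theta}{2}e^{i\phi}|-\rangle$ and $|-\rangle_n = \sin\frac{\theta}{2}|+\rangle - \cos\frac{\theta}{2}e^{i\phi}|-\rangle$.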
Note: the coefficient calculation attached to is from total probability being equal to 1.
Outer product
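Instead of taking the inner product between two state vectors, we can also find the outer product ($|\psi\rangle\langle\phi|$), which is an operator rather than a number. For example, we can rewrite $|\psi\rangle = a|+\rangle + b|-\rangle$ as $|\psi\rangle = |+\rangle\langle+|\psi\rangle + |-\rangle\langle-|\psi\rangle$. $|+\rangle\langle+|$ and $|-\rangle\langle-|$ are called projection operators: $P_+ = |+\rangle\langle+|$, $P_- = |-\rangle\langle-|$, and $P_+ + P_- = 1$.
This last formula is known as a closure or completeness relation due to the equivalency to the identity operator $1$, meaning that these basis states form a complete set of states.
When our projection operator acts on some state $|\psi\rangle$, $P_+|\psi\rangle = |+\rangle\langle+|\psi\rangle = (\langle+|\psi\rangle)|+\rangle$.
Note: $\langle+|\psi\rangle$ is equivalently the $a$ coefficient (and $\langle-|\psi\rangle$ the $b$ coefficient),
and the probability of observing some state is then $\mathcal{P}_+ = |\langle+|\psi\rangle|^2 = \langle\psi|P_+|\psi\rangle$. This allows us to write the fifth postulate:
Postulate 5: After a measurement of some observable $A$ that yields the result $a_n$, the quantum system is in a new state that is the normalized projection of the original system ket onto the ket corresponding to the result of the measurement: $|\psi'\rangle = \frac{P_n|\psi\rangle}{\sqrt{\langle\psi|P_n|\psi\rangle}}$.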
Returning to Stern-Gerlach 3 & 4
We were able to look at experiments 1 and partially 2 in chapter 1, but not experiments 3 and 4, which required expounding on operator products a bit further.
Experiment 3
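Experiment 3 measured $S_z$, then measured $S_x$, then $S_z$ again, observing a "reset" with the $S_x$ measurement.
The probability that an atom leaving the first analyzer as $|+\rangle$ is found in $|+\rangle_x$ and then in $|+\rangle$ again is the product of each probability - such that $\mathcal{P} = |\langle+|+\rangle_x|^2\,|{}_x\langle+|+\rangle|^2 = \frac{1}{2}\cdot\frac{1}{2} = \frac{1}{4}$.
Likewise, for it to be $|+\rangle$, then $|+\rangle_x$, then $|-\rangle$ is $\mathcal{P} = |\langle-|+\rangle_x|^2\,|{}_x\langle+|+\rangle|^2 = \frac{1}{4}$.
By convention, QM amplitudes / probabilities are read from right to left.
Experiment 4
Experiment 4 measured $S_z$, then put the atoms through an $S_x$ analyzer (but didn't measure them), then measured $S_z$ again, finding the "measurement" step to be an important one.
The results in this one are interesting but simple - representing them with postulate 5, the unmeasured $S_x$ analyzer projects onto both of its outputs at once: $|\psi_2\rangle = \frac{(P_{+x} + P_{-x})|+\rangle}{\sqrt{\langle+|(P_{+x} + P_{-x})|+\rangle}} = |+\rangle$, since $P_{+x} + P_{-x} = 1$.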
This can be expanded out (and is done so in McIntyre 2.2.4), but will not be included here for the sake of brevity.
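The difference between experiments 3 and 4 can be checked numerically. Below is a minimal sketch with NumPy, assuming the usual column-vector representations of the $z$ and $x$ kets; the helper names `projector` and `prob` are made up for illustration.

```python
import numpy as np

up = np.array([1, 0], dtype=complex)        # |+>
down = np.array([0, 1], dtype=complex)      # |->
up_x = (up + down) / np.sqrt(2)             # |+>_x
down_x = (up - down) / np.sqrt(2)           # |->_x

def projector(ket):
    """Outer product |ket><ket|."""
    return np.outer(ket, ket.conj())

def prob(final, state):
    """|<final|state>|^2 (np.vdot conjugates its first argument)."""
    return abs(np.vdot(final, state)) ** 2

# Experiment 3: start in |+>, keep only the |+>_x output, then look for |->
p_exp3 = prob(up_x, up) * prob(down, up_x)

# Experiment 4: both x outputs are recombined without being measured,
# so we project with P_{+x} + P_{-x} and renormalize (postulate 5)
P_both = projector(up_x) + projector(down_x)
state = P_both @ up
state = state / np.linalg.norm(state)
p_exp4 = prob(down, state)

print(p_exp3, p_exp4)
```

In experiment 3 a quarter of the atoms end up in $|-\rangle$, while in experiment 4 none do, because `P_both` is just the identity and the state stays $|+\rangle$.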
Mean and Standard Deviation
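The expected mean (or expectation value) of some measurement is represented by $\langle S_z\rangle$, such that $\langle S_z\rangle = \left(+\frac{\hbar}{2}\right)\mathcal{P}_+ + \left(-\frac{\hbar}{2}\right)\mathcal{P}_-$ and is a sum of each possible result multiplied by the probability of that result.
Probability can be represented by both outer products and inner products - inner products are more readable, but require squared absvals $|\langle+|\psi\rangle|^2$, while outer products $\langle\psi|+\rangle\langle+|\psi\rangle$ require more terms overall.
We can also write the expectation value for some operator $A$ as $\langle A\rangle = \langle\psi|A|\psi\rangle$. For $|+\rangle$, this can be written out as $\langle S_z\rangle = \langle+|S_z|+\rangle = \frac{\hbar}{2}\langle+|+\rangle = \frac{\hbar}{2}$.
Which makes sense, given $+\hbar/2$ is the only possible result of a measurement of $S_z$ for the $|+\rangle$ state. For some system prepared in $|+\rangle_x$, $\langle S_z\rangle = {}_x\langle+|S_z|+\rangle_x = 0$.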
Standard Deviation
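The root-mean-square deviation (r.m.s. deviation) is $\Delta A = \sqrt{\langle(A - \langle A\rangle)^2\rangle} = \sqrt{\langle A^2\rangle - \langle A\rangle^2}$.
In the rightmost version, the second term is the squared expectation value. The first term, in the case of the initial state $|+\rangle$, can be expanded into $\langle S_z^2\rangle = \langle+|S_z^2|+\rangle = \left(\frac{\hbar}{2}\right)^2$ and $\Delta S_z = \sqrt{\left(\frac{\hbar}{2}\right)^2 - \left(\frac{\hbar}{2}\right)^2} = 0$. Expected, since we only have one possible result for measuring $S_z$ - so no spread of possible results.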
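Here's a minimal NumPy sketch of the mean and r.m.s. deviation of $S_z$ for two states, $|+\rangle$ and $|+\rangle_x$. Units with $\hbar = 1$ and the helper names are assumptions for illustration.

```python
import numpy as np

HBAR = 1.0  # assumption: units where hbar = 1
S_z = (HBAR / 2) * np.array([[1, 0], [0, -1]], dtype=complex)

def expectation(op, ket):
    """<psi|A|psi> for a normalized ket (real for Hermitian A)."""
    return np.vdot(ket, op @ ket).real

def rms_deviation(op, ket):
    """sqrt(<A^2> - <A>^2), clipped at zero to absorb rounding error."""
    mean = expectation(op, ket)
    mean_sq = expectation(op @ op, ket)
    return np.sqrt(max(mean_sq - mean ** 2, 0.0))

up = np.array([1, 0], dtype=complex)                  # |+>
up_x = np.array([1, 1], dtype=complex) / np.sqrt(2)   # |+>_x

print(expectation(S_z, up), rms_deviation(S_z, up))
print(expectation(S_z, up_x), rms_deviation(S_z, up_x))
```

For $|+\rangle$ there is no spread at all, while for $|+\rangle_x$ the mean is zero and the spread is as large as it can be ($\hbar/2$), since both results are equally likely.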
Commuting Observables
Note: eigenstates are equivalent to eigenvectors.
The commutator for two operators is defined as $[A, B] = AB - BA$. If the commutator is zero, then the two operators (observables) commute. If nonzero, they don't commute. Logical consequences can be determined by just this statement alone - the first being: $AB = BA$. The order of operation doesn't matter for commuting observables.
Now, let $|a\rangle$ be an eigenstate (eigenvector) for some operator $A$ with eigenvalue $a$: $A|a\rangle = a|a\rangle$. Let's apply $B$ to both sides of this: $BA|a\rangle = aB|a\rangle$. Using the commutability of $A$ and $B$, $A(B|a\rangle) = a(B|a\rangle)$. This last equation says $B|a\rangle$ is also an eigenvector of $A$ corresponding to the same eigenvalue $a$. Thus (assuming the eigenvalue is non-degenerate), $B|a\rangle$ must be a scalar multiple of $|a\rangle$ (let's say this scalar multiple is $b$) - and we can write $B|a\rangle = b|a\rangle$.
Commuting operators will therefore share common eigenvectors - this means the two operators (that represent observables) are compatible, and we can measure one without erasing our knowledge of the results of the other observable.
This is often phrased as knowing the results of these observables simultaneously, though realistically we're still measuring each sequentially.
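If two operators do not commute, then they are incompatible. This is the case for our orthogonal spin operators - and the commutators for $S_x$, $S_y$ and $S_z$ are $[S_x, S_y] = i\hbar S_z$, $[S_y, S_z] = i\hbar S_x$ and $[S_z, S_x] = i\hbar S_y$.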
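The spin-1/2 commutation relations are easy to verify numerically. This is a minimal sketch with NumPy using the standard spin-1/2 matrices in the $S_z$ basis, in units where $\hbar = 1$ (an assumption for convenience).

```python
import numpy as np

HBAR = 1.0  # assumption: units where hbar = 1

S_x = (HBAR / 2) * np.array([[0, 1], [1, 0]], dtype=complex)
S_y = (HBAR / 2) * np.array([[0, -1j], [1j, 0]], dtype=complex)
S_z = (HBAR / 2) * np.array([[1, 0], [0, -1]], dtype=complex)

def commutator(a, b):
    """[A, B] = AB - BA"""
    return a @ b - b @ a

print(np.allclose(commutator(S_x, S_y), 1j * HBAR * S_z))
print(np.allclose(commutator(S_y, S_z), 1j * HBAR * S_x))
print(np.allclose(commutator(S_z, S_x), 1j * HBAR * S_y))
```

None of the three commutators vanish, so no two spin components are compatible observables.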
The Uncertainty Principle
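We can relate the product of two standard deviations (i.e. uncertainties) of two observables with the commutator by $\Delta A\,\Delta B \geq \frac{1}{2}\left|\langle[A, B]\rangle\right|$.
The Uncertainty Principle: the product of two uncertainties will always be greater than or equal to half the absolute value of the expectation value of their commutator.
For the $S_x$ and $S_y$ spin components in the $|+\rangle$ state, $\Delta S_x\,\Delta S_y \geq \frac{1}{2}\left|\langle[S_x, S_y]\rangle\right| = \frac{\hbar}{2}\left|\langle S_z\rangle\right| = \left(\frac{\hbar}{2}\right)^2$.
Implying $\Delta S_x \neq 0$ and $\Delta S_y \neq 0$ - while we can know one spin component exactly, we can never simultaneously know the other two (they are incompatible observables).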
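$\mathbf{S}^2$ Operator
The operator $\mathbf{S}^2 = S_x^2 + S_y^2 + S_z^2$ gives the squared magnitude of the spin vector and is represented in matrix form as $\mathbf{S}^2 \doteq \frac{3\hbar^2}{4}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.
Thus, since the $\mathbf{S}^2$ operator is proportional to the identity operator, it must therefore commute with all the other operators $S_x$, $S_y$ and $S_z$ - this implies all states are eigenstates of the $\mathbf{S}^2$ operator, and we can write $\mathbf{S}^2|\psi\rangle = \frac{3}{4}\hbar^2|\psi\rangle$ for any state $|\psi\rangle$ in the spin-1/2 system. The squared spin vector has an expectation value of $\langle\mathbf{S}^2\rangle = \frac{3}{4}\hbar^2$
and a "length" of $\sqrt{\langle\mathbf{S}^2\rangle} = \frac{\sqrt{3}}{2}\hbar$.
This is longer than the "normal" measured component of $\hbar/2$ along any axis - implying the spin vector can never be fully aligned along any one axis, since there will always be components in other axes.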
This is often called "quantum fuzziness".
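For reference, a short worked version of where the spin-vector "length" comes from, assuming the standard spin-1/2 matrices (each of which squares to $\hbar^2/4$ times the identity):

```latex
S_x^2 = S_y^2 = S_z^2 = \frac{\hbar^2}{4}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\quad\Rightarrow\quad
\mathbf{S}^2 = S_x^2 + S_y^2 + S_z^2 = \frac{3\hbar^2}{4}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}

\sqrt{\langle \mathbf{S}^2 \rangle} = \frac{\sqrt{3}}{2}\,\hbar \approx 0.87\,\hbar \;>\; \frac{\hbar}{2}
```

So even when $S_z$ is known exactly to be $\hbar/2$, the remaining "length" has to live in the $x$ and $y$ components, which we can't pin down.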
Regarding Photons
Photons also have spin - they are spin-1 particles, but because they are massless and move at $c$, the spin component along the direction of motion can never be 0 - it must be either $+\hbar$ or $-\hbar$.
Spin-1 Systems
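Fermions are particles with half-integer spins (i.e. $\frac{1}{2}$, $\frac{3}{2}$, $\frac{5}{2}$), while bosons are those with integer spins (i.e. $0$, $1$, $2$). Spin-1 particles have $S_z$ spin states of $|1\rangle$, $|0\rangle$ or $|-1\rangle$,
where $S_z|1\rangle = \hbar|1\rangle$, $S_z|0\rangle = 0\,\hbar|0\rangle$ and $S_z|-1\rangle = -\hbar|-1\rangle$, so that $S_z \doteq \hbar\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}$.
The eigenvalues $\hbar$, $0$ and $-\hbar$ are on the main diagonal of $S_z$.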
For ,
and for ,
Note: The Stern-Gerlach experiments have conceptually the same results - but note that experiment 2 differs (not all 1/3 - rather, one is 1/2, the other two 1/4).
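The 1/4, 1/2, 1/4 split in the spin-1 version of experiment 2 can be reproduced numerically. This is a minimal sketch with NumPy, assuming the standard spin-1 matrix for $S_x$ in the $S_z$ basis and units where $\hbar = 1$.

```python
import numpy as np

HBAR = 1.0  # assumption: units where hbar = 1

# Standard spin-1 S_x matrix in the S_z basis (|1>, |0>, |-1>)
S_x = (HBAR / np.sqrt(2)) * np.array(
    [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]], dtype=complex)

# Eigenvalues come back in ascending order: -hbar, 0, +hbar
eigenvalues, eigenvectors = np.linalg.eigh(S_x)

# Atom leaving the first analyzer in |1> (S_z = +hbar)
one = np.array([1, 0, 0], dtype=complex)

# Probability of each S_x result: |<eigenvector|1>|^2
probs = np.abs(eigenvectors.conj().T @ one) ** 2
print(eigenvalues, probs)
```

The $S_x = 0$ result gets probability 1/2, and the $\pm\hbar$ results get 1/4 each.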
General Quantum Systems
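Let $s$ denote the spin of some spin-$s$ system with $2s + 1$ number of beams (i.e. for a spin-1 system, this would be 3, spin-1/2 system, 2). Let the possible values for spin on the $z$ axis be labeled by $m$ - then, $\mathbf{S}^2|sm\rangle = s(s+1)\hbar^2|sm\rangle$ and $S_z|sm\rangle = m\hbar|sm\rangle$.
$s$ is known as the spin (angular momentum) quantum number and $m$ is the spin component quantum number (or magnetic quantum number).
In spin-1/2, $s = \frac{1}{2}$ with $m = \pm\frac{1}{2}$, and in spin-1, $s = 1$ with $m = 1, 0, -1$.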
Chapter 3 - Schrödinger Time Evolution
Reference Quantum Mechanics: A Paradigms Approach by David McIntyre.
The time evolution of a quantum system is governed by the differential equation $i\hbar\frac{d}{dt}|\psi(t)\rangle = H(t)|\psi(t)\rangle$,
where $H$ corresponds to the total energy of the system (the Hamiltonian operator - not to be confused with "Hermitian", though the Hamiltonian is itself a Hermitian operator).
Postulate 6: The time evolution of a quantum system is determined by the Hamiltonian (total energy operator) $H(t)$, through the Schrödinger equation: $i\hbar\frac{d}{dt}|\psi(t)\rangle = H(t)|\psi(t)\rangle$.
The eigenvalues of the Hamiltonian are the allowed energies of the quantum system, and the eigenvectors (eigenstates) are the energy eigenvectors of the system, such that $H|E_n\rangle = E_n|E_n\rangle$.
The Energy Basis
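Let's say we've already diagonalized $H$ and found values for the allowed energies $E_n$ and eigenstates $|E_n\rangle$. General state vectors can be written in terms of these energy eigenstates: $|\psi(t)\rangle = \sum_n c_n(t)|E_n\rangle$ and, since the energy eigenvectors are orthonormal, $\langle E_k|E_n\rangle = \delta_{kn}$, we can use them as the energy basis. Assuming the Hamiltonian in this context is time-independent, such that $H(t) = H$ for all $t$, then the solution to the time-dependent Schrödinger equation can be written $|\psi(t)\rangle = \sum_n c_n e^{-iE_nt/\hbar}|E_n\rangle = \sum_n c_n e^{-i\omega_nt}|E_n\rangle$, where $c_n = c_n(0)$ and $\omega_n = E_n/\hbar$, an angular frequency.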
Stationary States
Let's start with the simplest possible situation, where the quantum system is in one single energy eigenstate: $|\psi(0)\rangle = |E_1\rangle$. After some time $t$, this system will be in the state $|\psi(t)\rangle = e^{-iE_1t/\hbar}|E_1\rangle = e^{-i\omega_1t}|E_1\rangle$, where $\omega_1 = E_1/\hbar$. This state, however, differs from our initial state only by the phase factor $e^{-i\omega_1t}$, and since overall phase changes never affect the probability of measurements, the probability of observing some eigenvalue $a_j$ for an observable $A$ will be $\mathcal{P}_{a_j} = |\langle a_j|\psi(t)\rangle|^2 = |\langle a_j|E_1\rangle|^2$. This probability is time-independent and equal to the probability at $t = 0$. There is no measurable time evolution for this state, and the energy eigenstates are called stationary states - if a system begins in some energy eigenstate, it will remain in that state.
This same idea goes for multiple energy eigenstates: where
Non-commuting Observables
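If an observable $A$ commutes with $H$, then $A$ and $H$ have common eigenstates - so measuring $A$ is equivalent to measuring $H$. However, if $A$ does not commute with $H$, then the two observables will not share common eigenstates, and the eigenstates of $A$ will be some superposition of energy eigenstates.
Let the eigenstate of $A$ corresponding to $a_1$ be represented by $|a_1\rangle = \alpha_1|E_1\rangle + \alpha_2|E_2\rangle$, and for the state $|\psi(t)\rangle = c_1e^{-iE_1t/\hbar}|E_1\rangle + c_2e^{-iE_2t/\hbar}|E_2\rangle$ the probability of measuring $a_1$ would be $\mathcal{P}_{a_1} = |\langle a_1|\psi(t)\rangle|^2 = |\alpha_1|^2|c_1|^2 + |\alpha_2|^2|c_2|^2 + 2\,\mathrm{Re}\left(\alpha_1c_1^*\alpha_2^*c_2\,e^{-i(E_2 - E_1)t/\hbar}\right)$. The overall phase drops out - only the relative phase between $|E_1\rangle$ and $|E_2\rangle$ remains. It oscillates at $\omega_{21} = (E_2 - E_1)/\hbar$, which is also called the Bohr frequency.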
Summary
In a quantum system evolving under a time-independent Hamiltonian $H$, the probability of measuring the eigenvalue $a_j$ of some observable $A$ at time $t$ can be found through the following process:
- Diagonalize $H$ to find the eigenvalues $E_n$ and eigenvectors $|E_n\rangle$.
- Write $|\psi(0)\rangle$ in terms of the energy eigenstates $|E_n\rangle$.
- Multiply each eigenvector coefficient by $e^{-iE_nt/\hbar}$ to get $|\psi(t)\rangle$ for some arbitrary $t$.
- Calculate the probability $\mathcal{P}_{a_j} = |\langle a_j|\psi(t)\rangle|^2$.
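The four steps above can be sketched directly in NumPy. The function name, the choice of $\hbar = 1$, and the example Hamiltonian $H = \omega_0 S_z$ with $\omega_0 = 2$ are all assumptions made for illustration.

```python
import numpy as np

HBAR = 1.0  # assumption: units where hbar = 1

def measurement_probability(H, psi0, a_ket, t):
    """Probability of finding the eigenstate a_ket of some observable at time t."""
    energies, vecs = np.linalg.eigh(H)                    # 1. diagonalize H
    coeffs = vecs.conj().T @ psi0                         # 2. expand psi(0) in the energy basis
    coeffs_t = coeffs * np.exp(-1j * energies * t / HBAR) # 3. attach the phase factors
    psi_t = vecs @ coeffs_t
    return abs(np.vdot(a_ket, psi_t)) ** 2                # 4. |<a|psi(t)>|^2

# Hypothetical example: spin-1/2 in a z-directed field, starting in |+>_x
omega0 = 2.0
H = (HBAR * omega0 / 2) * np.array([[1, 0], [0, -1]], dtype=complex)
up_x = np.array([1, 1], dtype=complex) / np.sqrt(2)

for t in (0.0, 0.5, np.pi / omega0):
    print(t, measurement_probability(H, up_x, up_x, t))
```

For this example the probability of finding $|+\rangle_x$ again follows $\cos^2(\omega_0 t/2)$, so it starts at 1 and drops to 0 after half a precession period.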
Time Evolution of Spin
Let's apply Schrödinger time evolution to the spin-1/2 system.
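The Hamiltonian operator represents the total energy of a system. In time-dependent solutions, only energy differences are important - thus, we can take the Hamiltonian to be just the magnetic potential energy of the spin system, $H = -\vec{\mu}\cdot\vec{B} = -g\frac{q}{2m}\vec{S}\cdot\vec{B}$. If we say the gyroscopic ratio $g = 2$ and $q = -e$ (an electron), then $H = \frac{e}{m_e}\vec{S}\cdot\vec{B}$.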
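$z$-direction
In the $z$ direction, $\vec{B} = B_0\hat{z}$, and $H = \frac{eB_0}{m_e}S_z = \omega_0S_z$,
where $\omega_0 = eB_0/m_e$, and $H \doteq \frac{\hbar\omega_0}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
with eigenstuff $E_\pm = \pm\frac{\hbar\omega_0}{2}$ and $|E_\pm\rangle = |\pm\rangle$. Applying the Schrödinger time evolution to a general state in direction $\theta$ with phase shift $\phi$, $|\psi(0)\rangle = \cos\frac{\theta}{2}|+\rangle + e^{i\phi}\sin\frac{\theta}{2}|-\rangle$: $|\psi(t)\rangle = e^{-i\omega_0t/2}\left[\cos\frac{\theta}{2}|+\rangle + e^{i(\phi + \omega_0t)}\sin\frac{\theta}{2}|-\rangle\right]$. If we were to calculate the probability of measuring $S_z = +\hbar/2$ along the $z$ axis for this arbitrary direction $\theta$ and phase $\phi$, $\mathcal{P}_+ = |\langle+|\psi(t)\rangle|^2 = \cos^2\frac{\theta}{2}$.
Probability here is time-independent because the $S_z$ eigenstates are also energy eigenstates for this problem (i.e. $H$ and $S_z$ commute).
$x$-direction
The probability of measuring spin up along the $x$ direction instead, where instead of $\langle+|$ we have ${}_x\langle+|$, is: $\mathcal{P}_{+x} = |{}_x\langle+|\psi(t)\rangle|^2 = \frac{1}{2}\left[1 + \sin\theta\cos(\phi + \omega_0t)\right]$.
Probability in the $x$ axis is time-dependent because the $S_x$ operator does not commute with $H$.
For the $y$-direction, we'd use the same approach as above, except with ${}_y\langle+|$ instead.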
Expectation values
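Rather than looking at the three axes one at a time, we can look at the total spin vector's expectation value, $\langle\vec{S}\rangle = \frac{\hbar}{2}\left[\sin\theta\cos(\phi + \omega_0t)\,\hat{x} + \sin\theta\sin(\phi + \omega_0t)\,\hat{y} + \cos\theta\,\hat{z}\right]$, the precession of which is known as Larmor precession, with frequency $\omega_0$ known as the Larmor frequency. The expectation value of the spin vector precesses about the direction of a uniform magnetic field, as visualized below.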
Ehrenfest's theorem states that quantum mechanical expectation values obey classical laws - as visualized above, the precession of the spin vector makes it clear the system has nonzero angular momentum (as opposed to just a magnetic dipole moment).
Magnetic field in a general direction
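Let's say we have a magnetic field with both an $x$ and a $z$ component, such that $\vec{B} = B_0\hat{z} + B_1\hat{x}$. This field will be oriented in the $xz$-plane with some angle $\theta$ w.r.t. the $z$-axis. First, we'll define Larmor frequencies with each field component: $\omega_0 = \frac{eB_0}{m_e}$ and $\omega_1 = \frac{eB_1}{m_e}$. And the Hamiltonian is then defined as $H = \omega_0S_z + \omega_1S_x \doteq \frac{\hbar}{2}\begin{pmatrix} \omega_0 & \omega_1 \\ \omega_1 & -\omega_0 \end{pmatrix}$. This Hamiltonian isn't diagonal - so we need to diagonalize it. This leads to $E_\pm = \pm\frac{\hbar}{2}\sqrt{\omega_0^2 + \omega_1^2}$.
When $\omega_1 = 0$, the energy eigenvalues are just $\pm\hbar\omega_0/2$, same as in the $z$-only case.
From the above figure, we know $\tan\theta = B_1/B_0 = \omega_1/\omega_0$ and $H = \sqrt{\omega_0^2 + \omega_1^2}\,S_n$, where $S_n$ is the spin operator in the direction $\hat{n}$ of the field, with eigenvectors $|\pm\rangle_n$ (found in chapter 2).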
Spin flip
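The probability that some initial state $|+\rangle$ later evolves into $|-\rangle$ (known as spin flip) is $\mathcal{P}_{+\to-} = \frac{\omega_1^2}{\omega_0^2 + \omega_1^2}\sin^2\left(\frac{\sqrt{\omega_0^2 + \omega_1^2}}{2}t\right)$.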
This is called Rabi's formula - see the Rabi cycle.
Neutrino Oscillations
Neutrinos are produced via decay processes, such as $n \to p + e^- + \bar{\nu}_e$ and $p \to n + e^+ + \nu_e$, where $\nu_e$ is an electron neutrino and $\bar{\nu}_e$ is an electron antineutrino. Neutrinos interact with normal matter only via the weak nuclear force (and gravity), making them extremely hard to detect. Other flavors of neutrinos are produced via processes such as the pion-muon decay and muon-electron decay: $\pi^+ \to \mu^+ + \nu_\mu$ and $\mu^+ \to e^+ + \nu_e + \bar{\nu}_\mu$, where $\nu_\mu$ is a muon neutrino.
Electrons, muons and tau ($\tau$) particles, as well as their associated neutrinos, are known as leptons.
There is no theoretical basis for conservation of lepton flavors, so reactions of the type $\nu_e \leftrightarrow \nu_\mu$ are possible, where an electron neutrino changes flavor to a muon neutrino or vice versa - these reactions are known as neutrino mixing or neutrino oscillations.
When neutrinos are interacting via the weak nuclear force, they have some energy within that interaction, and the quantum states $|\nu_e\rangle$ and $|\nu_\mu\rangle$ are eigenstates of the Hamiltonian describing the weak interaction. In free space, with no weak interaction present, the only relevant Hamiltonian (energy description operator) is the relativistic energy of the particle (which, due to $E = \sqrt{(pc)^2 + (mc^2)^2}$, includes the rest masses and momenta) - this relativistic Hamiltonian has mass eigenstates $|\nu_1\rangle$ and $|\nu_2\rangle$.
If , then the mass eigenstates do not coincide with the weak interaction eigenstates - allowing for flavor-changing processes in the first place (somehow?).
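Let $|\nu_e\rangle = \cos\frac{\theta}{2}|\nu_1\rangle + \sin\frac{\theta}{2}|\nu_2\rangle$ and $|\nu_\mu\rangle = \sin\frac{\theta}{2}|\nu_1\rangle - \cos\frac{\theta}{2}|\nu_2\rangle$, where $\theta/2$ is the mixing angle between the two. If this angle is small, then the small-angle approximation applies and $|\nu_e\rangle \approx |\nu_1\rangle$, while $|\nu_\mu\rangle$ is (up to a sign) approximately $|\nu_2\rangle$.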
Example
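Assume an electron neutrino is created by a weak-interaction process and propagates through free space to a detector. We detect it by seeing the signature of flavor mixing, rather than the neutrino itself. Let our initial neutrino be an electron neutrino: $|\psi(0)\rangle = |\nu_e\rangle = \cos\frac{\theta}{2}|\nu_1\rangle + \sin\frac{\theta}{2}|\nu_2\rangle$. Since this neutrino's now propagating through free space, the energy eigenstates are just its mass eigenstates (no weak interaction) - so the Schrödinger time evolution is $|\psi(t)\rangle = \cos\frac{\theta}{2}e^{-iE_1t/\hbar}|\nu_1\rangle + \sin\frac{\theta}{2}e^{-iE_2t/\hbar}|\nu_2\rangle$, where $E_i$ represents the relativistic energy eigenvalues, determined by the rest mass and momentum of each. Assuming the neutrinos are highly relativistic, then $m_ic^2 \ll pc$ and $E_i = \sqrt{(pc)^2 + (m_ic^2)^2} \approx pc + \frac{(m_ic^2)^2}{2pc}$. The probability of the neutrino mixing is $\mathcal{P}_{\nu_e\to\nu_\mu} = |\langle\nu_\mu|\psi(t)\rangle|^2 = \sin^2\theta\,\sin^2\left(\frac{(E_1 - E_2)t}{2\hbar}\right)$. The energy difference is just proportional to the difference in squared masses (see above), $E_1 - E_2 \approx \frac{c^3}{2p}\left(m_1^2 - m_2^2\right)$, and, with $t \approx L/c$ where $L$ is the distance from the source to the detector and $p$ is the momentum: $\mathcal{P}_{\nu_e\to\nu_\mu} = \sin^2\theta\,\sin^2\left(\frac{(m_1^2 - m_2^2)Lc^2}{4p\hbar}\right)$. The probability oscillates from 0 to a maximum value of $\sin^2\theta$ ... hence the term neutrino oscillation.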
Experimental measurements have this mass difference is approximately and .
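To get a feel for the oscillation, here's a minimal sketch in plain Python. It assumes the common "practical units" form of the oscillation phase, $1.267\,\Delta m^2[\mathrm{eV}^2]\,L[\mathrm{km}]/E[\mathrm{GeV}]$ with $E \approx pc$, and keeps the convention of these notes that the amplitude is $\sin^2\theta$ (with $\theta/2$ the mixing angle). The function name and all the numbers plugged in are hypothetical, not measured values.

```python
import math

def oscillation_probability(theta, delta_m2_ev2, length_km, energy_gev):
    """P(nu_e -> nu_mu) for two flavors; theta/2 is the mixing angle."""
    # Practical-units form of (m1^2 - m2^2) L c^2 / (4 p hbar), assuming E ~ pc
    phase = 1.267 * delta_m2_ev2 * length_km / energy_gev
    return math.sin(theta) ** 2 * math.sin(phase) ** 2

# Hypothetical inputs, chosen only to show the shape of the curve
theta = 1.0            # radians
delta_m2 = 1.0e-4      # eV^2
energy = 0.005         # GeV

# Distance at which the phase first reaches pi/2 (first oscillation maximum)
first_max_km = (math.pi / 2) * energy / (1.267 * delta_m2)

for length in (0.0, first_max_km / 2, first_max_km):
    print(length, oscillation_probability(theta, delta_m2, length, energy))
```

At the source the probability is zero, and at the first maximum it reaches $\sin^2\theta$, the largest it can ever be for this mixing angle.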
Time-dependent Hamiltonians
Thus far, we've only looked at time-independent Hamiltonians. Time-dependent Hamiltonians are to them what a salmon in the ocean is to nigiri - complicated.
Rabi's formula generalizes with the change $\omega_0 \to \omega_0 - \omega$, where $\omega$ is the frequency of the incident light (i.e. the applied field is not static, but oscillates with time). For the case of $\omega = 0$ (zero-frequency light), the applied field is static and we can use our earlier definition of Rabi's formula.
The static field case ($\omega = 0$) is referred to as spin precession, the rotating field case ($\omega \neq 0$) as Rabi flopping. In the static field case, the spin precession is a natural oscillation of a quantum system operating in a superposition of energy eigenstates. In the rotating field case, the Rabi flopping represents transitions between energy eigenstates, and there is an exchange of energy between the particle system and the applied rotating field - since the Hamiltonian is time-dependent, energy is not static within the system, but moving into and out of the applied field.
Time-dependent Hamiltonian
Let's look at the scenario where some incident photons collide with some matter, increasing the energy levels of local electrons.
In a quantum sense, the oscillating electric field of the light wave interacts with the electric dipole of the atom, and the energy exchange between the field and the atom corresponds to the absorption and emission of photons.
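The probability of a Rabi spin flip oscillates with the angular frequency $\Omega = \sqrt{(\omega - \omega_0)^2 + \omega_1^2}$ (typically referred to as the generalized Rabi frequency - though the Rabi frequency itself usually refers to $\omega_1$).
The Hamiltonian of this system is $H = \omega_0S_z + \omega_1\left[\cos(\omega t)S_x + \sin(\omega t)S_y\right] \doteq \frac{\hbar}{2}\begin{pmatrix} \omega_0 & \omega_1e^{-i\omega t} \\ \omega_1e^{i\omega t} & -\omega_0 \end{pmatrix}$.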
Resonant frequency
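If $\omega = \omega_0$, the frequency of the rotating field is equal to the Larmor precession frequency $\omega_0$, and the probability of a spin flip just becomes $\mathcal{P}_{+\to-} = \sin^2\left(\frac{\omega_1}{2}t\right)$.
Spin is flipped with 100% probability, periodically, with frequency $\omega_1$.
Non-resonant frequency
At values $\omega \neq \omega_0$, the spin flip probability oscillates with an amplitude smaller than 1, namely $\frac{\omega_1^2}{(\omega - \omega_0)^2 + \omega_1^2}$.
The FWHM of the above curve is $2\omega_1$.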
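The resonance behavior can be sketched in a few lines of plain Python, assuming the generalized Rabi formula $\mathcal{P}_{+\to-} = \frac{\omega_1^2}{(\omega - \omega_0)^2 + \omega_1^2}\sin^2\left(\frac{\sqrt{(\omega - \omega_0)^2 + \omega_1^2}}{2}t\right)$. The function names and the example frequencies are made up for illustration.

```python
import math

def flip_amplitude(omega, omega0, omega1):
    """Maximum spin-flip probability at driving frequency omega."""
    detuning = omega - omega0
    return omega1 ** 2 / (detuning ** 2 + omega1 ** 2)

def flip_probability(omega, omega0, omega1, t):
    """Generalized Rabi formula for the |+> -> |-> transition."""
    big_omega = math.hypot(omega - omega0, omega1)  # generalized Rabi frequency
    return flip_amplitude(omega, omega0, omega1) * math.sin(big_omega * t / 2) ** 2

# Hypothetical frequencies (arbitrary units)
omega0, omega1 = 10.0, 0.5

print(flip_amplitude(omega0, omega0, omega1))            # on resonance
print(flip_amplitude(omega0 + omega1, omega0, omega1))   # detuned by omega1
print(flip_probability(omega0, omega0, omega1, math.pi / omega1))
```

The amplitude drops to half its peak value when the detuning equals $\pm\omega_1$, which is where the full width at half maximum of $2\omega_1$ comes from.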
Chapter 4 - Quantum Spookiness
Reference Quantum Mechanics: A Paradigms Approach by David McIntyre.
In the 20th century, as quantum mechanics was first developing widespread acknowledgement, various theories were put forth to try and explain the probabilistic (rather than deterministic) nature that surprised a lot of classical physicists.
Einstein-Podolsky-Rosen Paradox (EPR)
Some, such as Albert "God does not play dice" Einstein, thought the probabilistic nature of QM (especially when juxtaposed with the more-deterministic classical mechanics) was just because we weren't seeing some variables - we thought it probabilistic only because there was some backstage action we weren't seeing (an incomplete description of reality).
The experiment begins with an unstable particle with spin 0, which then decays into two spin-$\frac{1}{2}$ particles (consv. of angular momentum) traveling in opposite directions (consv. of linear momentum).
Because each spin is opposite the other, if observer 1 sees spin +1/2, observer 2 must see spin -1/2, and vice versa. After measurement, each observer always knows what the other observer sees.
This would be later called an entangled state.
There are two ways of looking at these particles:
- Each exists as part of a state that is in superposition, so neither particle has a defined spin until measurement.
- Both particles have spin values defined at the moment of decay - particle 1 always had spin $+\hbar/2$, particle 2 always spin $-\hbar/2$.
For example, imagine a particle decays on Earth, and the resulting spin-1/2 particles are kept (unmeasured) in two chambers (i.e. the quantum state describing both is in superposition by view 1). One particle is sent with a ship to Mars, and the other is sent on a ship to Alpha Centauri.
If Mars makes the measurement $S_z = +\hbar/2$, according to idea 1, the state instantaneously collapses into the state $|+\rangle_1|-\rangle_2$ and Alpha Centauri will measure $S_z = -\hbar/2$, regardless of the distance between the two. Idea 2 says that Mars's particle always had $+\hbar/2$, while idea 1 maintains that the superposition state must have collapsed at the moment of measurement.
The EPR paradox argues in favor of the second: the spin is a "real" property of the particle - a variable (invisible to our instruments then) describing reality. Instead of being in a superposition state pre-measurement, both particles had their spin states specified at the moment of decay and always did - Einstein's local hidden variables theory (localized to each particle).
For a while, it was thought impossible to know whether theory 1 or theory 2 was the correct interpretation ... that is, until 1964. Enter John Bell.
Bell's Theorem
Bell's theorem sets up a mathematical inequality that gives different predictions for a setup assuming local hidden variables (HVT) and a setup assuming standard quantum mechanics (the Copenhagen interpretation).
In the hidden variables theory (HVT) interpretation, this inequality is $\left|\langle A_0B_0\rangle + \langle A_0B_1\rangle + \langle A_1B_0\rangle - \langle A_1B_1\rangle\right| \leq 2$, where $A_i$ and $B_j$ represent the measurements taken by two observers (each choosing between two settings, 0 and 1) and $\langle\cdot\rangle$ denotes the expectation (or average) value of some measurement. This inequality is also called the Clauser-Horne-Shimony-Holt (CHSH) inequality.
Now, assume this was set up with qubits instead - see the Wikipedia article for more details. Summing the expectation values yields a magnitude of $2\sqrt{2}$, which is greater than the maximum of 2 allowed by the CHSH inequality, implying local hidden variables cannot reproduce quantum mechanics.
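The $2\sqrt{2}$ figure can be reproduced numerically. This is a minimal sketch with NumPy, assuming the two particles are in the spin singlet state, that each observer measures spin along a direction in the $xz$-plane, and using the combination $\langle A_0B_0\rangle + \langle A_0B_1\rangle + \langle A_1B_0\rangle - \langle A_1B_1\rangle$. The specific angles are a standard choice that maximizes the violation; the function names are made up for illustration.

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

def spin_along(angle):
    """Spin measurement (in units of hbar/2) along a direction in the xz-plane."""
    return np.cos(angle) * sigma_z + np.sin(angle) * sigma_x

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

def correlation(angle_a, angle_b):
    """Expectation value <A B> in the singlet state."""
    op = np.kron(spin_along(angle_a), spin_along(angle_b))
    return np.vdot(singlet, op @ singlet).real

a0, a1 = 0.0, np.pi / 2
b0, b1 = np.pi / 4, -np.pi / 4

chsh = (correlation(a0, b0) + correlation(a0, b1)
        + correlation(a1, b0) - correlation(a1, b1))
print(abs(chsh))
```

Any local hidden variable model is capped at 2 for this combination, so the printed value exceeding 2 is the whole point of the test.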
Schrödinger's Cat
Schrödinger, in a fit of psychosis, proposed putting a cat in a box. Never a good idea - but further, he decided that box should include both unstable radioactive isotopes and a bottle of cyanide! If an isotope decay is detected, boom. Dead cat. Now, this is a dumb experiment for so many reasons (animal cruelty not the least of which), but it's animal cruelty for science ... so ... yeah. That doesn't even justify it. Whatever. Don't do this experiment at home or anywhere else for that matter.
The atom has a half life of 1 hour. After 1 hour, the state of the life-deciding decaying atom is described by ... with the quantum state of the cat implied to be ... implying the cat is in a superposition state between dead and alive. Purgatory's got nothing on Schrödinger's pets. This (thankfully only) thought experiment raises two questions:
- Can macroscopic (cat) states be described quantum mechanically?
- What qualifies as a measurement? I.e. what causes the collapse of the wave function?
The Copenhagen interpretation states that, no, cats (nor any other macroscopic thing for that matter) cannot be represented by quantum functions - only classically, like with ... normal things, like what color it is, food, how fast it's going, etc. Normal things, like, by Jove, physicists are weird.
More human-centric folks have argued human consciousness causes the collapse, others that there is no collapse - rather, just a bifurcation into separate universes (the multiverse).
The usual non-answer to all this is often attributed to Feynman, though the phrase was coined by David Mermin:
Shut up and calculate.
Chapter 5 - Quantized Energies
Reference Quantum Mechanics: A Paradigms Approach by David McIntyre.
Spectroscopy
Atoms are differentiated by their specific atomic structure - how many neutrons, protons and electrons they have forms the basis for elements in the periodic table. Electrons provide most of the ways we can tell one element from another - when an atom absorbs a photon, the atom responds by raising an electron to a higher energy level; when the electron drops back down, the atom emits a photon.
Since the energy of a photon is $E = hf$ and we can only have specific values of energies, we can identify individual atoms by their frequencies - i.e. with quantized values for energy come quantized values for frequencies.
The lowest energy state ($-13.6$ eV for hydrogen) is called the ground state, with higher levels called excited states. The set of quantized energy states is referred to as the energy spectrum of a system.
From a quantum perspective, we can visualize the energy spectrum like this:
For a system prepared in some initial state $|\psi\rangle$, the probability of measuring some energy $E_n$ is $\mathcal{P}_{E_n} = |\langle E_n|\psi\rangle|^2$. We can find the energy levels and their corresponding eigenstates by applying the Hamiltonian in the energy eigenvalue equation, such that $H|E_n\rangle = E_n|E_n\rangle$.
Energy Eigenvalue Equation
Also known as the time-independent Schrödinger equation since it can be derived from the S.E. by separating the time and space components.
To find the equation describing energy eigenvalues,
- Find the classical form of the system's energy
- Replace each physical observable (position, momentum etc) with their quantum mechanical operators.
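For a simple moving particle, the energy is the sum of kinetic and potential energy, $E = \frac{p_x^2}{2m} + V(x)$, where $p_x$ is the momentum of the particle in the $x$ direction and $V(x)$ is the potential energy.
In quantum mechanics, our primary physical observables are usually position $\hat{x}$ and momentum $\hat{p}$, so our energy operator is just the quantum version of $E$ using $\hat{x}$ and $\hat{p}$: $H = \frac{\hat{p}^2}{2m} + V(\hat{x})$, with $\hat{x} \doteq x$ and $\hat{p} \doteq -i\hbar\frac{d}{dx}$ in the position representation, where the factor $-i\hbar$ is used to fix dimensions and ensure measurable results are real (non-imaginary).
Our quantum variables $\hat{x}$ and $\hat{p}$ both act on wave functions, which are really just alternate representations of quantum states: $|\psi\rangle \doteq \psi(x)$.
This is called position representation since we're using the position eigenstates as the preferred basis.
The wave function representing the energy eigenstates in position representation is $|E_i\rangle \doteq \varphi_{E_i}(x)$, so our energy eigenvalue equation becomes $H\varphi_{E_i}(x) = E_i\varphi_{E_i}(x)$.
Right. Let's combine everything. Using $H = \frac{\hat{p}^2}{2m} + V(\hat{x})$ with our quantum variables: $\left[\frac{1}{2m}\left(-i\hbar\frac{d}{dx}\right)^2 + V(x)\right]\varphi_E(x) = E\varphi_E(x)$; simplifying, $-\frac{\hbar^2}{2m}\frac{d^2\varphi_E(x)}{dx^2} + V(x)\varphi_E(x) = E\varphi_E(x)$.
$V(x)$ is the potential energy function, $\varphi_E(x)$ is the wave function representing the energy eigenstate and $E$ is the energy eigenvalue.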
The big thing that happens with wave functions: operator equations turn into differential equations.
The Wave Function
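The wave function $\psi(x) = \langle x|\psi\rangle$ is the probability amplitude for the quantum state $|\psi\rangle$ to be measured in the position eigenstate $|x\rangle$, with the probability density of measuring some value $x$ being $\mathcal{P}(x) = |\psi(x)|^2$.
Probability density
Since the probabilities of all measurements still sum to unity, for a continuous probability density function $\mathcal{P}(x)$, all of the following are equivalent: $1 = \langle\psi|\psi\rangle = \int_{-\infty}^{\infty}\mathcal{P}(x)\,dx = \int_{-\infty}^{\infty}|\psi(x)|^2\,dx = \int_{-\infty}^{\infty}\psi^*(x)\psi(x)\,dx$.
Similarly, if we wanted to know the probability a particle would be found between $a$ and $b$, $\mathcal{P}_{a<x<b} = \int_a^b|\psi(x)|^2\,dx$.
Note: the above curves of $|\psi(x)|^2$ represent probability density (probability per unit length): by multiplying by $dx$, we get rid of the unit length and end up with probability.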
Generally, to translate bra-ket formulae to wave function versions:
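- Replace kets with wave function: $|\psi\rangle \to \psi(x)$
- Replace bras with wave function conjugate: $\langle\psi| \to \psi^*(x)$
- Replace braket with integral over all space: $\langle\,|\,\rangle \to \int_{-\infty}^{\infty}dx$
- Replace operator with position representation: $\hat{A} \to A(x)$
So, to convert some probability amplitude $\langle\phi|\psi\rangle$: $\langle\phi|\psi\rangle = \int_{-\infty}^{\infty}\phi^*(x)\psi(x)\,dx$, with probability $\mathcal{P} = \left|\int_{-\infty}^{\infty}\phi^*(x)\psi(x)\,dx\right|^2$. To transform expectation values, $\langle A\rangle = \langle\psi|A|\psi\rangle = \int_{-\infty}^{\infty}\psi^*(x)A(x)\psi(x)\,dx$.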
Energy Wells
Our energy eigenvalue equation is $-\frac{\hbar^2}{2m}\frac{d^2\varphi_E(x)}{dx^2} + V(x)\varphi_E(x) = E\varphi_E(x)$. Solutions to this equation depend on what our potential energy $V(x)$ is - and that is dependent on context. Often, potential energy will resemble an energy well, such as seen below:
Some notes on vocabulary:
- Classically-forbidden region: kinetic energy can't be negative. Since $T = E - V(x)$, any region where $E < V(x)$ is called classically forbidden, since it would imply a negative kinetic energy. The edges of the allowed region on each side are called classical turning points.
- Classical turning points: locations where $E = V(x)$. The particle has only potential energy (no kinetic), and must "turn around" to go back into the well.
- Particles within the well are in bound states, while those outside the well are in unbound states.
The extent of the allowed and forbidden regions depends on our total energy used for a particular bound state. might be less than , and might change from particle to particle.
Infinite Square Well
The classical model for a particle well is a ball bouncing between two perfectly elastic walls, like the old bouncing DVD screensaver. A simple model for a bound particle follows the same rules:
- The ball flies freely between the walls
- The ball is reflected perfectly at each bounce
- The ball remains in the box regardless how fast it is
Let's find the energy eigenstates & eigenvalues using the energy eigenvalue equation. Outside the box, , so which is satisfied only if (a nice thought) or if everywhere outside the box. Inside the box, , so We already know , and is confined to a box of size . We need to find our total energy and the wave function . Rewriting our EEQ,
Above, and is defined as the wave vector.
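The solution to this equation is $\varphi_E(x) = A\sin kx + B\cos kx$. There are three unknowns: $A$, $B$ and $k$, which contains our total energy $E$. We can find two of them with our boundary conditions: $\varphi_E(0) = 0$ and $\varphi_E(L) = 0$. At $x = 0$, $\varphi_E(0) = B = 0$, so $B = 0$. At $x = L$ with $B = 0$, $\varphi_E(L) = A\sin kL$, which is zero if $A = 0$ (implying a full-zero wave function, not very useful), or when $kL = n\pi$ for $n = 1, 2, 3, \ldots$, allowing us to define yet another quantity $k_n$: $k_n = n\pi / L$.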
is called the quantum number, while is the quantization condition.
If we use this quantization condition as the wave vector , then we can solve for our total energy as with allowed energy eigenstate wave functions Our final constant can be solved for by normalizing the wave function to unity, such that leading us to the final definition of our wave equation, Since probability density can be calculated from some wave equation (which represents probability amplitude) via , then our quantized probability density becomes We can visualize these wave functions for different quantum numbers (and hence energy levels) like this:
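To make the quantized results concrete, here is a minimal numerical sketch. It assumes the standard textbook forms $E_n = n^2\pi^2\hbar^2 / (2mL^2)$ and $\varphi_n(x) = \sqrt{2/L}\,\sin(n\pi x/L)$ for a well spanning $0 \le x \le L$; the function names and the 1 nm electron well are illustrative choices, not part of the notes.

```python
import math

HBAR = 1.054571817e-34          # reduced Planck constant, J*s
M_ELECTRON = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19            # joules per electronvolt


def energy_level(n, width, mass=M_ELECTRON):
    """Energy E_n = n^2 pi^2 hbar^2 / (2 m L^2) in joules."""
    if n < 1:
        raise ValueError("quantum number n must be a positive integer")
    return (n * math.pi * HBAR) ** 2 / (2.0 * mass * width ** 2)


def wave_function(n, x, width):
    """Normalized eigenstate sqrt(2/L) sin(n pi x / L); zero outside the well."""
    if x < 0.0 or x > width:
        return 0.0
    return math.sqrt(2.0 / width) * math.sin(n * math.pi * x / width)


def total_probability(n, width, steps=10_000):
    """Midpoint-rule integral of |phi_n|^2 over the well (should be close to 1)."""
    dx = width / steps
    return sum(wave_function(n, (i + 0.5) * dx, width) ** 2
               for i in range(steps)) * dx
```

For example, `energy_level(1, 1e-9) / EV` gives the ground-state energy in eV for an electron in a 1 nm well, and `energy_level(2, 1e-9)` is four times larger, reflecting the $n^2$ scaling.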
Finite Square Well
Now, our energy eigenvalue equation outside the well becomes while inside the well it remains When we were forced to always have our particle sitting in the well, we only needed to worry about our wave number - let's also now set up to represent the case when the particle is outside the well. For bound states (i.e. when both inside and outside the well), and are real, so our EEVs are with general solutions where the even, odd values are applied to the energy levels (i.e. are odd).
Let's start with the even constants. At each boundary and , the solutions are the same since we positioned our "zero point" between them. with the normalization constant providing the third equation for all three unknowns (, , ). Dividing the two above equations gives which is equivalently (by substituting in our and formulae) Similarly for the odd solutions:
We can simplify all of this by creating some new constants , and , such that so
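The transcendental equations have no closed-form solution, so they are usually solved graphically or numerically. The sketch below assumes the common dimensionless form for a well of half-width $a$ and depth $V_0$: $z = ka$, $z_0 = a\sqrt{2mV_0}/\hbar$, with even states satisfying $z\tan z = \sqrt{z_0^2 - z^2}$ and odd states satisfying $-z\cot z = \sqrt{z_0^2 - z^2}$. These names and the bisection approach are assumptions made for illustration, not necessarily the constants used in the notes.

```python
import math


def bound_state_roots(z0, tol=1e-12):
    """Return [(parity, z)] for every bound state of a finite well with strength z0.

    Even roots lie in (j*pi/2, (j+1)*pi/2) for even j, odd roots for odd j,
    so each half-pi interval below z0 holds exactly one root found by bisection.
    """
    if z0 <= 0:
        raise ValueError("z0 must be positive")

    def mismatch(z, parity):
        rhs = math.sqrt(max(z0 * z0 - z * z, 0.0))
        if parity == "even":
            return z * math.tan(z) - rhs
        return -z / math.tan(z) - rhs

    roots = []
    j = 0
    while j * math.pi / 2 < z0:
        parity = "even" if j % 2 == 0 else "odd"
        lo = j * math.pi / 2
        # Stay just below the tan/cot singularity, and never exceed z0.
        hi = min((j + 1) * math.pi / 2 - 1e-9, z0)
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if mismatch(mid, parity) < 0.0:
                lo = mid
            else:
                hi = mid
        roots.append((parity, 0.5 * (lo + hi)))
        j += 1
    return roots


def energy_fraction(z, z0):
    """Bound-state energy as a fraction of the well depth, E / V0 = (z / z0)^2."""
    return (z / z0) ** 2
```

A shallow well (`z0 = 1`) returns a single even state, while `z0 = 2` returns one even and one odd state. There is always at least one bound state, however shallow the well.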
Chapter 6 - Unbound States
Reference Quantum Mechanics: A Paradigms Approach by David McIntyre.
In bound states, the energy levels of particles are quantized - they are restricted to specific values. In unbound states, energy is no longer quantized, and operates on a continuum.
We still, however, will use the energy eigenvalue equation, its associated Hamiltonian, and the energy eigenvalue wave differential equation.
Free particle eigenstates
For a free particle, is zero everywhere, so the EEV differential equation becomes where . This has the general solution of Since we're no longer in a bound state, we no longer have constraints on extrema (except for the normalization condition) - and hence continuous energy, no longer quantized. We can apply the Schrödinger time evolution to this by multiplying the energy basis (above) by a phase factor, such that Using the Planck energy relation , This wave function represents a wave that retains its shape as it moves, with a speed determined by , the phase velocity.
Thus, represents the part moving in the positive direction, while represents the part moving in the negative direction.
We can also use to represent the wave vector eigenstates, with the sign of indicating the direction of motion.
Note: we need both positive and negative values to make a general energy eigenstate.
Momentum Eigenstates & de Broglie Relation
The momentum eigenvalue equation is
This is equivalently in braket notation.
Applying to above, implying that is the momentum eigenvalue with an associated eigenstate
Note: this is a function of position , not momentum : is the independent variable while is the particular momentum eigenvalue.
Since , we can use our momentum eigenvalue to create the de Broglie relation: The momentum eigenstates are also energy eigenstates for our free particle, with energy
This means that the momentum and energy operators commute. Any given momentum eigenstate will have some energy given by the above equation, but some energy state doesn't necessarily have a definite momentum, since a single energy state usually corresponds to 2+ momentum states - meaning the energy state is degenerate.
Momentum W.E.
The time-dependent wave equation governing a momentum eigenstate is with a probability density given by Unfortunately, this momentum eigenstate is constant regardless of position - it is spread out over all space unto . This makes the momentum eigenstate impossible to normalize - unless we create a superposition of momentum eigenstates to create wave packets.
Review on Basis States:
All basis states should demonstrate
with the orthogonality and normalization conditions able to be written in the form of the Kronecker delta: The Dirac delta function is the Kronecker delta used for continuous, rather than discrete sets, where is a function that is zero for all , except at , where it is infinite.
Thus, the orthonormality condition for the momentum eigenstate can be expressed in Dirac notation like this: which can be translated into wave function notation using the rules from Chapter 5: If we define our normalization constant then our normalized momentum eigenstates are
Momentum P.D.
If we wanted to find the probability amplitude for some general state to have momentum (i.e the projection of the general state onto momentum basis ), we'd want to find ... however, we're using in place of in our wave-function.
... but wait! and are not the same functions. They're the general state operating in different bases.
So ... let's just represent this momentum wave function with a different symbol instead. This is known as the momentum space wave function, a continuous wave function that represents the quantum state vector in terms of the momentum eigenstates.
The sinusoidal waves that make up this momentum eigenstate can be combined using the Fourier transform of to form a combined "wave packet" with a location in the basis, given by ... which we can also write in the basis with an inverse Fourier transform:
Wave Packets
Wave packets are localized superpositions of smaller waves that obey constructive and destructive interference.
The wave packet envelope and carrier often move at different speeds. There is much more to this chapter, but I think I'll call it here for now and write up a study guide for the imminent final.
Chapter 16 - Quantum Computing
In the 1980s, Feynman asked "Can a classical computer reliably model a quantum mechanical system?" - initially, the answer was no, since the Hilbert space scales based on the number of particles involved by a factor of ...
... except ... nature does this anyway, regardless of the complexity involved.
Thus, enter the quantum computer - at least, the idea of it.
Qubits - Quantum Bits
Classical computing uses binary digits (bits) to store information: 0 or 1. These numbers are strung together to represent larger numbers (1000101), but fundamentally use transistors that are either on or off (binary states).
In quantum information systems, information is stored in quantum bits (qubits) - each a two-state system: The key difference is that qubits can exist in superposition states - where the state of the processor isn't only 0/1, but rather with probabilities of measuring each state as or , as opposed to classically 100% either 0 or 1. As we include more qubits, we exponentially increase the information storage capacity of the system: and the general superposition state is For an -qubit system, a single superposition state contains coefficients to represent information - compared to a classical system, where though bits has possible states, each state contains only bits of information.
However - if we measure a state, the system state vector collapses from superposition onto the measured state vector - so we can only extract pieces of information from our system. We can get around this with quantum parallelism, as a consequence of quantum entanglement.
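As a small illustration of the counting argument, the sketch below stores an $n$-qubit state as a list of $2^n$ complex amplitudes and returns the probability of each measurement outcome as the squared magnitude of its amplitude. The function names and the list-based representation are assumptions made for this example.

```python
import math


def normalize(amplitudes):
    """Scale a list of complex amplitudes so the probabilities sum to 1."""
    norm = math.sqrt(sum(abs(c) ** 2 for c in amplitudes))
    if norm == 0.0:
        raise ValueError("state vector must have at least one nonzero amplitude")
    return [c / norm for c in amplitudes]


def measurement_probabilities(amplitudes):
    """Map each n-bit outcome string to its probability |c_i|^2."""
    size = len(amplitudes)
    n_qubits = size.bit_length() - 1
    if size < 2 or 2 ** n_qubits != size:
        raise ValueError("need 2^n amplitudes for an n-qubit state")
    state = normalize(amplitudes)
    return {format(i, f"0{n_qubits}b"): abs(c) ** 2
            for i, c in enumerate(state)}
```

A 3-qubit state needs 8 amplitudes, yet a single measurement returns only one 3-bit string, which is the extraction limit described above.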
Entangled States
The EPR state is entangled because measurements on one spin are perfectly anti-correlated with measurements on the other spin. It's the fourth of a set of Bell states, the complete set being This set is known as the Bell basis. Measuring one qubit in an entangled state affects the other (regardless of its location) instantaneously, allowing us to achieve the aforementioned parallel measurements - as long as we're clever about data organization.
We can also have 2-qubit product states which are expressed as a product of 1-qubit states
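One way to tell the two cases apart numerically: a 2-qubit pure state $a|00\rangle + b|01\rangle + c|10\rangle + d|11\rangle$ can be factored into a product of 1-qubit states exactly when $ad - bc = 0$. The sketch below applies that test. The function names and the basis ordering `[00, 01, 10, 11]` are assumptions for illustration.

```python
import math


def product_state(first, second):
    """Tensor product of two 1-qubit states, ordered as [00, 01, 10, 11]."""
    (a0, a1), (b0, b1) = first, second
    return [a0 * b0, a0 * b1, a1 * b0, a1 * b1]


def is_entangled(state, tol=1e-12):
    """True if the 2-qubit pure state cannot be written as a product state."""
    a, b, c, d = state
    return abs(a * d - b * c) > tol


S = 1.0 / math.sqrt(2.0)
SINGLET = [0.0, S, -S, 0.0]                 # (|01> - |10>) / sqrt(2)
PLUS_ZERO = product_state([S, S], [1.0, 0.0])  # (|0> + |1>)/sqrt(2) times |0>
```

Here `is_entangled(SINGLET)` is true, while `is_entangled(PLUS_ZERO)` is false, even though both are superpositions.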
Quantum algorithms are not immune to the probabilistic nature of quantum mechanics - if the same program is run twice on a quantum computer, the two results may differ. However, for certain problems they allow us to produce answers in far fewer steps than a classical computer.
This is why the idea of a binary-quantum processor is popular - each "half" has strengths that make up for the other's weaknesses.
Extinction
This is how we quantify how a star's magnitude changes with the airmass between the observer and the observed star.
Unfortunately, extinction varies with wavelength, since the sky absorbs some wavelengths more intensely than others ... so we'll see different values of for each filter used.
To correct for this, let , where is a color (i.e. ) - then, 2nd-order extinction is normally quite small (~0.04 magnitudes), but still important to include for high-precision photometry & blue stars. It can be determined with a close pair of different-color stars by fitting the slope of vs (since is a constant change-in-color for the two) - we end up with
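For the first-order term, the fit is just a straight line. The sketch below assumes the usual relation $m_{obs} = m_0 + kX$, where $X$ is airmass, $k$ is the extinction coefficient for one filter and $m_0$ is the above-atmosphere magnitude. The function name and the sample numbers in the usage note are made up for illustration.

```python
def fit_extinction(airmasses, magnitudes):
    """Least-squares fit of m = m0 + k * X; returns (k, m0)."""
    n = len(airmasses)
    if n != len(magnitudes) or n < 2:
        raise ValueError("need at least two (airmass, magnitude) pairs")
    mean_x = sum(airmasses) / n
    mean_m = sum(magnitudes) / n
    sxx = sum((x - mean_x) ** 2 for x in airmasses)
    if sxx == 0.0:
        raise ValueError("airmasses must not all be identical")
    sxm = sum((x - mean_x) * (m - mean_m)
              for x, m in zip(airmasses, magnitudes))
    k = sxm / sxx                 # slope: magnitudes per unit airmass
    m0 = mean_m - k * mean_x      # intercept: magnitude at zero airmass
    return k, m0
```

With synthetic points generated from `k = 0.2` and `m0 = 12.0` at airmasses `[1.0, 1.2, 1.5, 2.0]`, the fit returns those same two values. Real data would scatter around the line, and each filter needs its own fit.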
Image Reduction Pipeline
Use several Jupyter notebooks, rather than a single "The Master Notebook" - it can get long and big and slow.
- Take calibration frames (biases, flats (dome and/or sky) and darks, if necessary)
- Observe targets (science, standard stars and airmass-constraint stars (extinction stars))
- Remove instrumental signature from observations using calibration frames.
- Measure stellar fluxes (i.e. with aperture photometry or PSF photometry)
- Solve for extinction coefficients / the space magnitude
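The "remove instrumental signature" step can be sketched with plain nested lists. This assumes the common recipe of (raw - bias - dark) / normalized flat, with the dark already bias-subtracted and scaled to the science exposure time, and the flat already bias-subtracted. A real pipeline would use array libraries and FITS files; the function names here are illustrative only.

```python
from statistics import median


def median_combine(frames):
    """Pixel-wise median of equally-shaped 2-D frames (lists of rows)."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[median(f[r][c] for f in frames) for c in range(cols)]
            for r in range(rows)]


def normalize_flat(flat):
    """Divide a bias-subtracted flat by its mean so it averages to 1."""
    values = [v for row in flat for v in row]
    mean = sum(values) / len(values)
    return [[v / mean for v in row] for row in flat]


def calibrate(raw, master_bias, master_dark, norm_flat):
    """Apply (raw - bias - dark) / flat to every pixel."""
    return [[(raw[r][c] - master_bias[r][c] - master_dark[r][c]) / norm_flat[r][c]
             for c in range(len(raw[0]))]
            for r in range(len(raw))]
```

Median-combining many bias, dark and flat frames first keeps a single cosmic ray or stray star from contaminating the master frame.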
Charge-Coupled Devices (CCDs)
The CCD process:
- Light hits some detector array, triggering a tiny detector (typically 10-15 μm in size)
- Detector translates energy of the incident photon to some digital value
- Software confirms the receipt and reconstructs an array from the values.
CCDs have high quantum efficiency (electrons per incident photon) and a generally linear response. The exception is toward the blue, where QE falls off sharply, though total quantum yield (number of electrons extracted per incident photon) increases.
Additional Reference: see the Hubble Space Telescope docs.
CCDs vs CMOS
Note: each element of a camera array is a CCD or a CMOS chip.
- CCDs do the photon-to-electron conversion at the CCD chip - then the entire array is converted at some analogue-digital (A/D) converter at the end.
- CMOS chips (such as in phones) both detect photons and have built-in A/D converters on every CMOS array element - so each CMOS chip is able to report its current receipt value.
The problem is CMOS chips heat up quickly, which causes thermal noise on the image. CCDs also generate heat in the A/D chip, but to a much smaller extent which we can correct for a bit using a cooling device.
CCD Noise
Random noise from:
- Read noise (incl. A/D 'digitization' noise)
- Poisson noise - originates from the quantization of electric charges.
is the number of photons detected, is ... what?
Systematic noise from:
- Bias level - like ISO. Applied before the A/D converter to ensure the read noise is interpretable. Since the received photons will usually follow a Poisson distribution and look like a Gaussian curve, we "shift it" to avoid below-zero counts - corrected for with bias frames.
- Dark current - originates from the camera's electrical construction itself, from the random generation of electrons and holes within the depletion region of the camera itself - corrected for with dark frames.
- Multiplicative noise - non-uniform illumination, other sources, hot pixels, etc (often called "fixed pattern noise") - removed by using flat frames.
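The random terms above are commonly combined in quadrature. The sketch below uses one widely quoted form of the "CCD equation", which is an assumption here rather than something derived in these notes: Poisson noise contributes a variance equal to the number of detected electrons, and read noise contributes its square once per pixel. Parameter names are illustrative.

```python
import math


def ccd_snr(source_e, sky_e=0.0, dark_e=0.0, read_noise_e=0.0, n_pix=1):
    """Signal-to-noise ratio for a source measured in an n_pix-pixel aperture.

    source_e      total electrons from the source
    sky_e, dark_e electrons per pixel from sky background and dark current
    read_noise_e  RMS read noise in electrons per pixel
    """
    variance = source_e + n_pix * (sky_e + dark_e + read_noise_e ** 2)
    return source_e / math.sqrt(variance)
```

In the photon-limited case (`ccd_snr(10000)`), the result is simply the square root of the counts, which is why longer exposures improve S/N only as the square root of exposure time. Bias level and flat-field effects are systematic, so they are removed during reduction rather than entering this variance.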
Characterization
We can characterize the noise and linearity of detectors using Photon-Transfer-Curves (PTCs):
Time
Definitions:
- Solar time: noon is defined as when the sun crosses the meridian.
- Mean solar time: allows each day to have the same length.
- In 1884, the International Meridian Conference adopted the mean solar time at Greenwich, Greenwich Mean Time (GMT), as the world standard ... which then became UTC.
- ... which was then modified to become independent from GMT, now known as UT, as the time when stars cross the meridian (transit).
- UT has versions - UT1 is most popular, UT0 or UT2 are also used (rarely).
- Standard stars used to determine UT.
- Transit telescope: slews along the meridian (north / south) to tell when a star moves across - does not slew east-west.
- Julian date: count of days since noon on January 1, 4713 BC.
- Created in 1583 by Joseph Scaliger to help make history line up.
- Synchronized with UT (i.e. J2000 is Julian 2000 w.r.t UT)
- MJD is Modified Julian Date - first 2,400,000.5 days subtracted to make the number shorter.
- i.e. measured since 1858 CE rather than 4713 BCE.
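The Julian date definitions above can be checked with a short conversion. This sketch uses the fact that the Unix epoch (1970-01-01 00:00 UTC) corresponds to JD 2440587.5; it treats UTC as a stand-in for UT and ignores leap seconds, which is an approximation. The function names are illustrative.

```python
from datetime import datetime, timezone

JD_UNIX_EPOCH = 2440587.5   # Julian date of 1970-01-01 00:00 UTC
MJD_OFFSET = 2400000.5      # JD - MJD


def julian_date(moment):
    """Julian date of a datetime; naive datetimes are assumed to be UTC."""
    if moment.tzinfo is None:
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.timestamp() / 86400.0 + JD_UNIX_EPOCH


def modified_julian_date(moment):
    """Modified Julian Date, i.e. JD with the first 2,400,000.5 days removed."""
    return julian_date(moment) - MJD_OFFSET
```

The J2000 epoch, `datetime(2000, 1, 1, 12, tzinfo=timezone.utc)`, comes out as JD 2451545.0 and MJD 51544.5. The half-day offset in MJD moves the day boundary from noon to midnight.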
Days:
Sidereal (or stellar) days are measured with respect to the stars. Solar days are measured with respect to the sun - slightly longer because the Earth moves approximately 1 degree around its orbit each day.
LST and Hour Angle
Local Sidereal Time (LST) and Hour Angle are the way we navigate using equatorial coordinates.
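The two quantities are tied together by the standard relation HA = LST - RA: an object with hour angle zero is on the meridian, negative values are east of it (still rising) and positive values are west. A minimal sketch, with all angles in hours and an illustrative function name:

```python
def hour_angle(lst_hours, ra_hours):
    """Hour angle in hours, wrapped into the range [-12, 12)."""
    return (lst_hours - ra_hours + 12.0) % 24.0 - 12.0
```

For example, an object at RA 23h observed at LST 1h has `hour_angle(1, 23)` equal to 2, meaning it crossed the meridian two hours ago.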
Spectrographs
Spectrographs:
- Take light from a source (into the telescope)
- Pass it through a slit
- Collimate it (force all beams to pass in a single direction)
- Push it through a dispersing element (like a prism to separate it)
- Pass it through a camera (to reform it)
- Send to the observer
Diffraction gratings are often used as the dispersing element (in place of a prism) because prisms can get really expensive.
Sometimes, folks will sandwich prisms, gratings and another prism, to form a GRISM - this is the case at APO's KOSMOS.
Taking spectra
We don't need great seeing for spectra - so they can be useful to take on nights with poor seeing. However, spectra require long exposure times for reasonable S/N ratio in every pixel.
Guiding can be done by a "slit view" camera (i.e. spectrograph in center, normal otherwise) or via an off-axis guider (like at MRO).
Resolving power
For spectra, resolution is defined as
which can be for high-resolution spectra. is generally limited by slit-width.
Resolution can also be related to Doppler shift via
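As a quick numerical sketch of both relations, assuming the standard definitions $R = \lambda / \Delta\lambda$ and $\Delta v = c / R$ (the function names and example values are illustrative):

```python
C_KM_S = 299_792.458   # speed of light, km/s


def resolving_power(wavelength, delta_wavelength):
    """R = lambda / delta-lambda; both arguments in the same units."""
    return wavelength / delta_wavelength


def velocity_resolution_km_s(r):
    """Smallest resolvable Doppler velocity difference, c / R, in km/s."""
    return C_KM_S / r
```

Resolving a 0.5 Å feature at H-alpha (6563 Å) needs `resolving_power(6563, 0.5)`, about 13,000, which corresponds to a velocity resolution of roughly 23 km/s.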
Wavelength calibration
- Find a lamp spectrum (i.e. using a dome light)
- Extract a 1-D spectrum using the same aperture and trace
- Using known lines and their pixel values, determine a dispersion solution
- Apply it to the 1-D spectrum
Checklist
Arrival
-
Log the level of the water tank. The water tank is located down the stairs heading toward the bedrooms, in the utility room on the right. The light to this room is (most inconveniently!) located on the left side of the doorway. Measure the water level by opening the valve to the gauge marked "in. H2O", which is located just above the water pump. The clear tubing will also match the level in the tank.
- Measure each day - refill if under 1/4 full.
- Prepare the water system.
- Turn on the water pump with circuit breaker 8 on panel M.
- Flush both toilets, which helps avoid the dreaded sewer backup!
-
Check the level of the water tank on a daily basis so as to avoid the (also dreaded) low water alarm.
- Do not turn on the water heater.
-
Prepare heat and air conditioning.
- If heat is needed - flip switch from 'Off' to 'Heat' on per-room basis.
- If AC needed, uncover the vents outside just north of the main entrance and turn on the AC circuit breaker.
- Add potable water to foot pump - in kitchen, put the kitchen tube in one of the big blue water tanks.
- Initialize camera - on Sleipnir the Slow, open Evora (`http://localhost/` on Sleip, or `http://72.233.250.83/` elsewhere on the network).
  - Initialize and set temperature to -82° C.
See Telescope Setup & Calibration for information on preparing the telescope for observation.
Departure
- Put away all media items (books, games, cables) in living room & observing room.
-
Kitchen:
- Wash dishes & clean up kitchen - wipe the sink down too.
- Collect trash & recycling to bring back.
- Refill kitchen and restroom soap dispensers w/ big soap bottle (add water if foaming dispenser).
- Restrooms - clean restrooms, wash toilets & sinks, sweep the floors.
- Cleaning the floors - vacuum floor upstairs, sweep floor downstairs.
-
Water cleanup:
- Measure water level and log it in the Google Doc.
- Remove tube from water jug under sink & seal it.
- Electricity:
- Turn off water pump circuit breaker.
- Unplug appliances & turn off power strips (don't turn off observing room computers).
- Turn off lights.
- Turn all thermostats off.
- Groceries - take pictures of inventories - note anything the next group should pick up from the store.
- Water jugs - put empty blue water jugs and empty clear plastic gallon jugs (if you can fit them) in car.
- Finalization - close living room shutters & all interior doors, and lock them.
This guide is shortened from the official version, available at https://sites.google.com/a/uw.edu/mro/checklist.
Telescope Setup & Calibration
- Initialize camera - on Sleipnir the Slow, open Evora (`http://localhost/` on Sleip, or `http://72.233.250.83/` elsewhere on the network).
  - Initialize and set temperature to -82° C.
- Press 'Home' on the filter wheel.
- Initialize telescope - open Bifrost on Heimdall.
- 'Initialize telescope systems'
- Make sure control paddle is set to `set` not `slew`
-
Turn telescope on with key on RA skirt in dome
- Confirm 4 green lights in / around mount
-
Slew to zenith - use bubble level and hand paddle to bring scope to zenith.
- 'Load Zenith Coordinates' on Bifrost once this is done, then 'Update Telescope Coordinates'
Wait till sunset now.
-
Open dome
- Unclip dome power cables
- Plug in 3-prong lower shutter door motor, and round upper-shutter motor cable (note the upper-shutter cable has a specific orientation to follow).
- Raise upper shutter until the lower shutter motor turns on. Continue raising upper shutter, but you can now open the lower shutter.
- Once both are fully open, re-clip power cables where they were.
-
Remove telescope covers
- Make sure telescope is at zenith with correct RA/DEC in Bifrost
- 'Slew to Cover Position' in Bifrost - watch telescope cables. Emergency stop if it looks like anything's going to get caught.
- Use the ladder to remove the main telescope cover & the 6" finder cover. Store both on south table.
-
Take bias exposure
- It should look similar to this example bias.
-
Take twilight flats
- Move the telescope ~20 degrees east of zenith (west for morning flats)
- Calibrate pointing
  - Load `pointing_stars.txt` in Bifrost and select a bright star with a low airmass.
  - Turn on tracking, slew to the star and then try to center the star in the finderscope manually with the Xbox remote. Click 'Update Pointing to Target' once it's lined up.
  - Check another bright star nearby to be sure. Fine-tune to get to center-ish of FoV.
-
Calibrate focus - use a dimmer star for this one
- Goal is to minimize star PSF. Use focus helper in Evora for this - large leaps in focus increments on Bifrost are okay.
- Remember: focus is a relative measurement on the telescope, not absolute - "zero" focus is physically wherever the focus was when Bifrost booted up.
End-of-night
If dome flats needed: close dome, turn on dome flat light and point telescope at a relatively matte / flat area. Take flats as needed, then follow procedure below.
- Shutdown Evora.
-
Attach covers
- 'Slew to Cover Position' in Bifrost, then replace covers. Pay attention to cables.
- 'Park Telescope' to slew back to zenith
- Turn off the telescope with the key, then close Bifrost.
-
Close dome
- Rotate dome so cables are near ports
- Plug cables in - close smaller shutter (3-prong plug) first, followed by upper shutter
- After dome is closed, unplug and reclip cables.
Calibration Frames
Frame type | Details | Exptime | Num | Periodicity |
---|---|---|---|---|
Bias | Readout noise on CCD - 'static' you see on low exposure times. | 0 | 50-100 | Few weeks |
Flat | Help mitigate effect of dust and lens vignetting on lights - also acts as per-pixel correction factor. | 1-50 | ~50 | Every run |
Dark | Try to capture noise from temp and ISO over length of exposure time. | LIGHTs | At least 5, ~20/exp | Every new exp |
Filter notes:
- Bias frames are filter-irrelevant.
- Flat frames are filter-specific - 1 every 5 exptime steps should work.
- Rerun darks every once in a while - build a library with ISO, temp and exptime.
- For sky flats, take them when sky is as bright (twilight) as possible with least sensitive filter, tracking on, with some offset between each exposure ... OR ... no tracking and allow startrails.
- Allows for variations in star intensities (i.e. frame 1 has a bright star, frame 2 only dark stars)
Telescope notes:
- `Bifrost` does NOT record airmass in FITS header ... nor UTC or RA or DEC.
- The MRO telescope will start to create "footballs" at ~12-15 minutes due to imperfect tracking - we could better this with a "real" pointing module.
Resources & Info
Resources, links and reference information for Manastash Ridge Observatory.
Location: 46.9511°N 120.7245°W
Resources
Evora Client - frontend website for controlling the MRO CCD camera.
Evora Server - backend to work with the Andor SDK 2.
Git cheat sheet - really lovely design, covers most use cases
Exoplanet transit calculator (NASA) <-- this should automatically set up for MRO
Exoplanet transit calculator (TESS)
Camera Info
The CCD is an Oxford Instruments Andor iKon-M 934.
Links
Fri. May 24 - Sat. May 25
José, Dylan, Eliz, Parker, Matt, Yasin, Ishan
Arrival
Acquired copious amounts of soda from Fred Meyer due to a bizarre buy-two-get-three deal. Arrived at MRO around 7:00 PM, skies clearing up from earlier rain, mid-50s temperature - hoping for clear skies tonight.
Checklist notes: Water level in tank after turning on the pump at 20% - refill next time.
Night - Fri. May 24
Notes:
- Always use the target list to start slewing. Make sure telescope is able to slew from Vega to Arcturus.
- Started the night aligning the finder scope with the 0.8m.
- Started observations around 3:30 AM - see problems log below.
- Nonlabeled flats are sky flats - domeflats are indicated such in header
- Wrote bash script to take domeflats automatically
Object | Exptime | Filter | File | Notes |
---|---|---|---|---|
M57 | 10 | V | ecam-81 | |
M57 | 50 | Ha | ecam-82 | |
M57 | 30 | B | ecam-83 | DUST |
M57 | 10 | B | ecam-84 | DUST :( |
GJ 1243 | 35 | r | ecam-(85:105) | |
Problems:
- Focus issues early on that we worked on until midnight - emergency stop of Bifrost software meant we lost track of current focus position. Eventually reset.
- Slewing / connection issues with Bifrost
- DUST - see ecam-83
- Same dust particles different filters? ... filter may not have moved or dust on CCDs.
Sat. May 25 - Day
Forgot to disable tracking while doing domeflats - telescope slewed entire night. Spent morning re-acquiring domeflats.
Pulled off CCD to see if dust effects were on it.
Dusty darling | Better |
---|---|
Wish List
Resupply
- AA batteries
- Recycling bin
- Sponges 🙏
Projects
- Web interface to interface with both `evora-server` and `Bifrost`
- Dome control in web interface
- Weather details + webcam in web interface
- Autofocus mechanism - APO-style
-
Multi-filter series exposure planning
- Bash script already set up
- Old raspberry pi w/ wifi to act as an audio server (bluetooth and airplay, spot maybe)
Fri. June 21 - Sat. June 22
Oliver, Carter, Bruce, Maggie, Daniel, Thomas, Ruby, Parker, Anika
ASTR 481 Trip - Orientation
Fri. June 21
Arrived to observatory around 3:30 PM - after tour we made (slightly undercooked :( ) baked potatoes on the grill as well as beans / guac - ended up acceptably tasty.
Set up audio server to allow AirPlay, Spotify Play and Bluetooth - see below.
Around 11:00 PM started observatory setup.
- Sleipnir didn't boot for a long time - see details below.
- After Sleipnir woke up, we ran into issues with `telescopepi` not starting - the mounted `telescopepi` is a Pi 2 with an old version of the filter control software. The new version did not work with `evora-server`. Fixed corrupted Pi 2 with the old filter control software + restarted, no troubles after.
  - Old filter control vs. new?
Acquired Vega and Arcturus, until clouds moved in making further observation difficult. Trip snoozed around 3 AM.
Grocery List
- Out of white sugar, almost out of brown sugar
- Ginger powder as a spice
- Paper towels
- Aluminum foil
- Coffee creamer
- Plastic forks, napkins, paper plate
Inventory
Sleipnir
Sleipnir was acting slower than usual - ironically named. Boot hung up on something to do with `SGX` - apparently okay to ignore, but boot logs also showed some issue with an `md5sum` failed verification on a portable disk image (paraphrased). StackExchange answers claimed it was probably something to do with Mint, no solutions provided. Booting takes ~3 minutes as a result, in addition to the slowness of `ubuntu`.
To test whether the bottleneck was software or hardware, installed another distro alongside `ubuntu` - shrunk `ubuntu` disk 100 gigabytes and installed `manjaro` on it. Note - boot drive backup made at `/home/mrouser/.backup` just in case.
Results: Manjaro booted much faster, but still some of the same slow speeds persisted, even with no services running - so I bet it's hardware. Hard drive is a Toshiba MQ01ACF050 - 7200 rpm, Pentium 4 processor.
Audio Server
Set up an audio server connected to the 3.5mm jack on the vinyl-amp-thing - it supports connections via Bluetooth, AirPlay and Spotify Play, and can be further configured at http://audio.
As it runs on a Raspberry Pi, make sure the little black box with the fan is powered on and shows red / green lights - if you run into any issues just restart it.
Thurs. Jul 11 - Mon. Jul 15
José, Naim, Parker, Ryan, Anika, Daniel
ASTR 481 Trip - Team 3 Trip 1
Week Goal(s)
Characterize Evora, the iKon-M 934 camera that MRO uses as a CCD imaging camera, by obtaining extinction and transformation coefficients.
While here, our plan is:
-
At the start of each observing run, take 10 bias frames to act as "per-night calibration frames".
- Thurs. 11
- Fri. 12
- Sat. 13
- Sun. 14
-
Create a set of <20 dome flats to find the shortest accurate shutter speed without seeing the remnants of the shutter.
- Done Sat. 13
-
Create a set of <15 sky flats to show linearity from minimum to CCD saturation.
- Done Sat. 13
-
Take a few darks to estimate Evora's dark current in counts / min.
- Done Fri. 12 & Sat. 13
-
Take 5 dome flats in and filters at around 80% of the point that nonlinearity begins (around 40,000?)
- Done Sat. 13
-
Take 5 sky flats in and filters around 80% of the nonlinearity point (similar to dome flats)
- Done Sat. 13
-
Observe an extinction star field over the course of several airmasses
- Thurs. 11 & Fri. 12
-
Observe standard stars at the lowest possible airmass
- Done Fri. 12
... and to observe some DSOs of interest.
Possible Interesting Objects
Veil Nebula | Iris Nebula | Eagle Nebula | North American Nebula | Ring Nebula | Dumbbell Nebula | |
---|---|---|---|---|---|---|
Alt. Name | NGC 6960 | NGC 7023 | M16 | NGC 7000 | M57 | M27 |
Thurs. Jul 11
Stopped by Uwajimaya and Fred Meyer along the way to collect supplies for 5 nights' stay. It's nice out today - the skies are clear, temperatures high 70s. No bugs in sight in the observatory, which was a welcome change. Started our epic dinner saga off by making sushi rolls!
After sunset, we started by calibrating the telescope. An issue that came up early in the night was that, for some reason after updating pointing to a target (i.e. once we got lined up with Vega), the telescope would stop tracking (without indicating such on Bifrost). This was resolved by just stopping and starting tracking again - still, curious.
Spent a few hours focusing the telescope and making sure it was pointing alright. Spent the remainder of the night taking pictures of GD 336.
Fri. Jul 12
Quiet day today. Folks started waking up around noon - we added the spare darkroom monitor to the observing room to try getting some more real-estate on Sleipnir.
Garlic brown butter tilapia for dinner - Ryan made some rice and Anika a salad to accompany our meal. Observations started tonight with little fuss - we continued with our extinction field GD 336 initially, then stars in our standard field SA38.
We noticed around 2:00 AM that the red lights in the dome might've been visible in the images themselves (tested with on and off) - decided to keep them on since they'd been on for the whole night thus far. More testing needed.
Left: SA38 with lights on. Right: SA38 with lights off.
To end the night, grabbed some pictures of the Dumbbell Nebula (M27) and the Ring Nebula (M57).
Sat. Jul 13
Waffles for breakfast - started a Smash Bros Subspace Emissary campaign today within which we led a campaign of carnage (on normal mode).
Later, using SIRIL, I processed our pictures of M57 in false color ( to red, to blue and to green):
And the dumbbell nebula using a similar process
We hadn't applied our calibration frames yet, so I wonder how different they'd look if we did - let's try it once we get the master frames set up. Also each of these was only a combination of the 3 exposures - I wonder if we did a whole bunch of exposures whether we'd get a much higher SNR.
Wishlist item: a non-relative focusing mechanism. The focus seems to "drift" overnight and it can be hard to know when we're on target.
At sunset, we took our sky flats, then some dome flats and darks to finish up our first project report. Since we still had an hour or so until astronomical twilight, we tried training the telescope on the Moon - and saw this!
Since the Evora control software was modified to allow <1s exposure times, we were able to take ~0.3s exposures of the lunar surface.
After this, we sat down and watched the first Alien (1979).
Fun fact, the actor who played Bilbo Baggins in the Lord of the Rings (Ian Holm) was the same who played the science officer, Ash.
Sun. Jul 14
Cinnamon rolls in the AM, then more Brawl. Swamp is a hard level. Cleaned up some in the afternoon, then went on a hike in the evening. I think it was called Umtanum Falls? When we returned, Ryan made some hotdogs with pickled onion - everyone was famished, so the meal was extra good.
Tried to take a look at some variable stars that needed observation on AAVSO, but we chose one that was a bit too dim (mag 16) to be distinguished easily - it was visible with a ~200 second exposure, but the SNR was so low that it was almost indistinct - so we packed it in for the evening, anticipating an early day on Monday.
Mon. Jul 15
Hoping to return to Seattle with some time to spare, we woke up around 8:00 AM and started cleaning, grabbing breakfast while we ran around getting everything back in order.
Inventory notes made in the class Doc and checklist followed, we packed our gear up and headed out around 8:45 AM, grabbing coffee in Ellensburg on the way back from Jenikka's.
Using the PIC16F88 as an I2C device
Final project for PHYS 335A: Digital Electronics, taught Spring of 2024 by Dr. David Pengra at the University of Washington.
The PIC16F88 is an 8-bit, lightweight microcontroller from Microchip Technology which is useful for a variety of low-cost and low-power, high speed embedded systems operations.
Here I describe some details of how I try to set up an I2C implementation on the PIC16F88, and design a program using MPASM Assembly language to operate it as an I2C target device (slave), to be controlled by higher-level controller (master) devices such as Arduinos or Raspberry Pis.
Note: MPASMx is no longer supported as of MPLAB X v5.40 - use v5.35 or earlier if you're intending to use it here. See footnote 1 for more details on this & using OSx.
Introduction to I2C
I2C (or inter-integrated circuit) is, like so many things in electrical engineering, a complicated-looking protocol that's actually pretty simple in a clever sort of way.
When compared to its closest serial communications cousin SPI (or serial peripheral interface), I2C only requires 2 lines (SDA and SCL) compared to SPI's 4 (MISO, MOSI, SCLK and CS), and allows for far more devices to be connected due to the use of peripheral addressing rather than SPI's chip select.
The master device / controller, in my case an Arduino Mega 2560, sets the clock frequency that the bus operates at. When no data transmission is happening, both SDA and SCL are pulled to Vdd through the pullup resistors (shown above - typical choices range from 4.7 kΩ to 10 kΩ).
Since this implementation of I2C is open-drain, the controller and target toggle the lines by directly grounding them - the pullup resistors keep this from shorting the supply to ground. Note this means that the longer SCL=0 or SDA=0, the more power the circuit will use.
Pullup resistor choice is important if you're trying to optimize - small PU resistors have high power consumption but also high speeds, while large PU resistors have low power consumption but also create capacitance delays. Texas Instruments has a good description of this.
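To put rough numbers on that trade-off, here is a small sketch in plain C++ (meant for a desktop compiler, not part of the project firmware). It uses the standard RC charging result for a 30% to 70% swing, t = R·C·ln(7/3) ≈ 0.8473·R·C. The helper names, the 100 pF bus capacitance and the 5 V supply used in the example values are illustrative assumptions, not measurements from this circuit.

```cpp
#include <cassert>
#include <cmath>

// Time for an RC-charged line to rise from 30% to 70% of Vdd:
// t = R * C * ln(0.7 / 0.3), roughly 0.8473 * R * C.
double rise_time_s(double pullup_ohms, double bus_farads) {
    return pullup_ohms * bus_farads * std::log(0.7 / 0.3);
}

// Current drawn through one pullup while its line is held at ground (Ohm's law).
double low_current_a(double vdd_volts, double pullup_ohms) {
    return vdd_volts / pullup_ohms;
}
```

With the assumed 100 pF bus, `rise_time_s(4700.0, 100e-12)` comes out near 0.40 µs and `rise_time_s(10000.0, 100e-12)` near 0.85 µs, while `low_current_a(5.0, 4700.0)` is a little over 1 mA per grounded line.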
By my understanding, I2C (in 7-bit addressing space; see footnote 2) generally operates like this:
- Controller device sends a START condition by pulling SDA low while SCL is high.
- Controller sends a 7-bit address followed by a read/write bit, one bit per clock cycle.
- After the 8th bit (the 8th SCL pulse), target devices compare the sent address with their own. If the two match, the target acknowledges on the 9th clock and continues. Otherwise, it ignores the transfer - this stream isn't meant for that device.
- Controller and specified target communicate.
- A STOP condition (SDA released high while SCL is high) is issued by the controller to indicate the conversation is over.
Each 'communication sequence' happens in 9-clock intervals - 8 bits for data, with the 9th clock giving the receiving device a chance to acknowledge (ACK/NACK) and so tell the other what to expect next. The PIC16F88 SSP section of the datasheet has a handy visualization of the target transmitting data back to the host - what we're planning on doing here.
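As a concrete illustration of the addressing step, the first byte on the bus packs the 7-bit address into bits 7:1 and the read/write flag into bit 0. The sketch below is plain C++ with hypothetical helper names (`address_byte`, `target_matches`, `is_read`); it only models the bit layout, not the electrical signalling.

```cpp
#include <cassert>
#include <cstdint>

// First byte of a transfer: 7-bit address in bits 7:1, R/W flag in bit 0
// (1 = controller reads from the target, 0 = controller writes to it).
constexpr std::uint8_t address_byte(std::uint8_t addr7, bool read) {
    return static_cast<std::uint8_t>(((addr7 & 0x7F) << 1) | (read ? 1 : 0));
}

// A target compares only the upper seven bits of the received byte with its own address.
constexpr bool target_matches(std::uint8_t received, std::uint8_t own_addr7) {
    return (received >> 1) == (own_addr7 & 0x7F);
}

// Bit 0 of the received byte tells the target whether it is being read from.
constexpr bool is_read(std::uint8_t received) {
    return (received & 0x01) != 0;
}
```

For the address 0x77 used later in this project, `address_byte(0x77, true)` is 0xEF and `address_byte(0x77, false)` is 0xEE.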
So - let's talk about the project.
I2C Rangefinder Project
We'll use the PIC16F88 in conjunction with the HC-SR04 ultrasonic rangefinder module to create a rangefinding device, retrieving the rangefinder values from the PIC by communicating over I2C.
For our controller, we'll use an Arduino Mega 2560 - chosen because it allows serial console monitoring, has 5V I2C logic by default (the PIC16F88 uses 5V, and thus has 5V I2C outputs), and has a number of high-level logic libraries for interfacing with I2C (see the Wire library for more information).
It is possible to use an I2C controller with 3.3V logic (such as a Raspberry Pi) with 5V target devices or vice versa; however, you'll need to convert logic levels between the two, either with a dedicated logic-level conversion device or by using N-channel MOSFETs and pull-up resistors. Refer to NXP note AN10441 for more information.
Hardware
Making a circuit diagram for this,
For my pullup resistors I decided to use 4.7 kΩ, as I'm not particularly worried about power consumption on a proof-of-concept. Note that RA<x> refers to pins PORTA<x>, and similarly RB<x> refers to pins PORTB<x>.
Note: it's quite important that the controller and target devices share a common ground - otherwise, the "reference ground voltage" for I2C pulses may not be the same and a circuit may not function as hoped!
Here's the physical implementation of this circuit.
Software
First, I made sure code for the rangefinder was working before fiddling with I2C. I made use of some of the code provided in Lab 8 - Sonic Ranger with Interrupts & Hardware Control, ensuring the rangefinder worked by tinkering with the debugger and observing variable changes in response to changed echo distances from the rangefinder. This could have also been done by reusing our Lab 7 code, but I thought the Lab 8 code looked much neater.
Now - onto the software side of I2C. The SSP section on I2C in the PIC16F88 datasheet is unfortunately quite lacking with regards to exact implementation methods, so I was forced to scrounge around a bit through sources official and otherwise for information.
In brief, the PIC16F88 doesn't easily support a controller-mode implementation of I2C out of the box, lacking the MSSP module found in other PICs that would allow easier implementation.
Instead, the 'F88 supports a variety of both SPI and I2C target device implementations that may be used instead - refer to Register 10.2 in the datasheet SSP section. If you're looking to implement master-mode on a PIC, look to other devices such as the PIC16F1508 which have an MSSP module.
Using the PIC16F88 datasheet section on SSP (paying particular attention to section 10.3.1) as well as the section on interrupts and the section on configuring PORTB, here's the process I followed to implement "I2C Slave mode, 7-bit address with Start and Stop bit interrupts enabled".
1. Clear ANSEL to enable digital (instead of analog) inputs.
2. Set TRISB<1> (SDA) and TRISB<4> (SCL) pins, turning them into inputs.
   - "But wait! I2C is bidirectional!" you might argue - this is true. The SSP (synchronous serial port) module in the PIC16F88 will automatically set and clear TRISB<1>/<4> in response to SDA and SCL events.
3. Set bits in SSPCON (SSP control register):
   - SSPCON<5> - SSPEN, enables SSP
   - SSPCON<4> - CKP, clock polarity, allowing controller to oscillate SCL
   - SSPCON<3:0>=1110 - SSPM<3:0>, "SSP mode select". See Register 10.2 in the SSP section of the datasheet for other options - by using 1110, we're setting our SSP mode to I2C target mode with 7 bit addressing, start/stop interrupts enabled.
4. Choose an address for your device. It must not be 0x00 or 0x01, and since this is 7-bit addressing it can go no higher than 0x7F. I chose 0x77 arbitrarily. Write it into SSPADD.
   - Since only bits SSPSR<7:1> are compared to SSPADD, run an RLF instruction (rotate-left-through-carry) on SSPADD to make sure your address matches the checked one.
5. Enable interrupts. Set bits:
   - INTCON<7> - GIE, enable global interrupts
   - INTCON<6> - PEIE, enable peripheral interrupts
   - PIE1<3> - SSPIE, enable SSP interrupts
6. Create interrupt functions in your ISR space
   - Check if PIR1<3> (SSPIF) is set to make sure the interrupt was caused by I2C, if there are multiple possible interrupts in your code.
   - Function to Save STATUS and W as you normally would.
   - WriteData - to send data, write one byte (8 bits at most) to SSPBUF, then set SSPCON<4> to indicate to the controller that you (the target) are ready to transmit. SSPBUF will now start automatically sending.
   - Function to Load STATUS and W as you normally would.
   - Clear PIR1<3> (SSPIF) to reset the SSP ...
7. ... and RETFIE.
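To sanity-check the bit positions in the SSPCON, SSPADD and interrupt-enable steps above, the register values they add up to can be worked out on a desktop. This is a plain C++ sketch rather than PIC code; the helper names are hypothetical, and it assumes every bit not mentioned in the steps is left at 0 (and that the carry flag is clear when RLF runs).

```cpp
#include <cassert>
#include <cstdint>

// Mask with a single bit set, e.g. bit(5) == 0x20.
constexpr std::uint8_t bit(unsigned n) {
    return static_cast<std::uint8_t>(1u << n);
}

// SSPCON: SSPEN (bit 5) + CKP (bit 4) + SSPM<3:0> = 1110.
constexpr std::uint8_t sspcon_value() {
    return static_cast<std::uint8_t>(bit(5) | bit(4) | 0x0E);
}

// SSPADD: the 7-bit address moved into bits 7:1, mirroring the RLF step.
constexpr std::uint8_t sspadd_value(std::uint8_t addr7) {
    return static_cast<std::uint8_t>((addr7 & 0x7F) << 1);
}

// INTCON: GIE (bit 7) + PEIE (bit 6).
constexpr std::uint8_t intcon_value() {
    return static_cast<std::uint8_t>(bit(7) | bit(6));
}

// PIE1: SSPIE (bit 3).
constexpr std::uint8_t pie1_value() {
    return bit(3);
}
```

Under those assumptions the values are SSPCON = 0x3E, SSPADD = 0xEE (for address 0x77), INTCON = 0xC0 and PIE1 = 0x08, which are handy numbers to look for in the debugger's register view.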
My ISR implementation following 6-7 above looks like this:
; Interrupt Service Routine ----------------------------------------------------
ORG 0x0004 ; ISR beginning
SaveState
MOVWF SAVE_W ; Save W register
SWAPF STATUS, W ; Save STATUS reg
MOVWF SAVE_STAT ; ... into temporary reg
LoadAndSend
MOVFW TimerCounts ; Load last pulse period into W
MOVWF SSPBUF ; Load SSPBUF with W (will be sent)
BSF SSPCON,CKP ; Set CKP bit to indicate our buffer is ready
LoadState
SWAPF SAVE_STAT,W ; Load STATUS
MOVWF STATUS
SWAPF SAVE_W, F ; Load W
SWAPF SAVE_W, W ; Load W into W
BCF PIR1,SSPIF ; Clear serial interrupt flag
RETFIE ; ... and return to program execution
; End ISR ----------------------------------------------------------------------
and relevant I2C code blocks like this:
; I2C initialization subroutine ------------------------------------------------
SetI2C
BANKSEL ANSEL ; Bank 1
CLRF ANSEL ; Set to all-digital inputs
BSF TRISB, TRISB1 ; Set PORTB<1> (SDA) as an input
BSF TRISB, TRISB4 ; Set PORTB<4> (SCL) as an input
CLRF SSPSTAT ; Reset SSPSTAT
BANKSEL SSPCON ; Bank 0
BSF SSPCON, SSPEN ; Turn on SSP
BSF SSPCON, CKP ; Enable clock (if 0, holds clock low)
BSF SSPCON, SSPM3 ; I2C 7-bit slave-mode w/ int is SSPM<3:0>=1110
BSF SSPCON, SSPM2
BSF SSPCON, SSPM1
BCF SSPCON, SSPM0
BANKSEL SSPADD ; Bank 1
MOVLW I2C_ADDR ; Load I2C address into W
MOVWF SSPADD ; Load SSPADD with I2C_ADDR (0x77, chosen arbitrarily)
RLF SSPADD,F ; Left-shift since SSPSR<7:1> compared
; Interrupt configuration
BANKSEL INTCON ; Bank 0
BSF INTCON, GIE ; Enable global interrupts - SSP interrupt on START
BSF INTCON, PEIE ; Enable peripheral interrupts
BANKSEL PIE1 ; Select PIE register
BSF PIE1, SSPIE ; Enable SSP interrupts in peripheral interrupt reg
; End subroutine ---------------------------------------------------------------
Results
Using the Arduino IDE with the Wire library, I set up a basic script that, after initializing I2C (at the Wire library's default 100 kHz clock) and the serial monitor (at a baudrate of 9600), would send an I2C "read" query down SDA to address 0x77 (our PIC) to try and retrieve our TimerCounts value from our interrupt function LoadAndSend shown above.
#include <Wire.h>
// Notes: D21 is SCL, D20 is SDA for MEGA 2560
void setup() {
  // Initialize I2C
  Wire.begin();
  // Start serial output for console monitoring
  Serial.begin(9600);
  Serial.println("Setup complete. Starting loop ...");
}
// Poll 0x77 address - then sleep for two seconds - then poll.
void loop() {
  // Request 1 byte from 0x77
  Wire.requestFrom(0x77, 1);
  // Send results (if any) to serial monitor
  while (Wire.available()) {
    // Unsigned, so counts above 127 don't print as negative numbers
    uint8_t c = Wire.read();
    Serial.print("Distance: ");
    Serial.print(c, DEC);
    Serial.println(" cm");
  }
  delay(2000);
}
And, after some tinkering, checking the serial monitor revealed ...
Voilà!!
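One caveat on the "cm" label: the byte sent by the PIC is a raw TMR0 count, not a calibrated distance. As a rough check, the plain C++ sketch below converts counts to centimetres using the settings in the full program (4 MHz oscillator, so a 1 µs instruction cycle, with a 1:64 prescaler) plus an assumed speed of sound of 343 m/s. The function and constant names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

constexpr double kTickSeconds = 64e-6;  // 1 us instruction cycle x 1:64 prescaler
constexpr double kSoundMps    = 343.0;  // assumed speed of sound in air

// The echo time covers the round trip, so halve it for the one-way distance.
constexpr double counts_to_cm(std::uint8_t counts) {
    return counts * kTickSeconds * kSoundMps / 2.0 * 100.0;
}
```

Under these assumptions one count is about 1.1 cm, so the raw count is a reasonable stand-in for centimetres, and the 8-bit timer tops out at roughly 2.8 m before it would overflow.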
Notes
I definitely didn't get this first try - I wrote the Arduino script pretty early on to monitor the PIC16F88 while I figured out the maze of interrupt flags necessary to work with I2C.
Something I should note - I've wasted at least a few hours during this project because I wasn't operating on the correct bank while trying to operate on some register. Such problems won't often be immediately apparent, only becoming clear when you step through your code with a debugger and realize a value isn't changing when it should.
BANKSEL is a friend!
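A way to picture what goes wrong: on this family of PICs an instruction only carries the low 7 bits of a register address, and the bank bits RP1:RP0 in STATUS supply the rest. The sketch below is plain C++ with a hypothetical `effective_address` helper; the register addresses are what I believe p16f88.inc defines, so double-check them against the datasheet before relying on them.

```cpp
#include <cassert>
#include <cstdint>

// Register file addresses (assumed from p16f88.inc; verify against the datasheet).
constexpr std::uint16_t PORTB  = 0x06;  // bank 0
constexpr std::uint16_t SSPBUF = 0x13;  // bank 0
constexpr std::uint16_t TRISB  = 0x86;  // bank 1
constexpr std::uint16_t SSPADD = 0x93;  // bank 1

// The opcode keeps only address bits 6:0; the selected bank supplies bits 8:7.
constexpr std::uint16_t effective_address(std::uint16_t reg, std::uint8_t bank) {
    return static_cast<std::uint16_t>(((bank & 0x03) << 7) | (reg & 0x7F));
}
```

With bank 0 still selected, `effective_address(SSPADD, 0)` evaluates to 0x13, which is SSPBUF, so the address would silently land in the data buffer instead. Selecting bank 1 first gives 0x93 as intended, which is exactly what BANKSEL takes care of.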
Code
The full program I used for this is written below.
;*******************************************************************************
;
; Filename: Final Project I2C Ranger
; Date: 5/31/2024
; File Version: 1.0
; Author: Parker Lamb
; Description: Sets PIC16F88 up as I2C-ready ultrasonic rangefinding device.
;
;*******************************************************************************
;*******************************************************************************
;
; Processor initial setup
;
;*******************************************************************************
list F=inhx8m, P=16F88, R=hex, N=0 ; File format, chip, and default radix
#include p16f88.inc ; PIC 16f88 specific register definitions
__config _CONFIG1, _MCLR_ON & _FOSC_INTOSCCLK & _WDT_OFF & _LVP_OFF & _PWRTE_OFF & _BODEN_ON & _LVP_OFF & _CPD_OFF & _WRT_PROTECT_OFF & _CCP1_RB0 & _CP_OFF
__config _CONFIG2 , _IESO_OFF & _FCMEN_OFF
Errorlevel -302 ; switches off msg [302]: Register in operand not in bank 0.
;*******************************************************************************
;
; Constants and variables
;
;*******************************************************************************
; Program vars
TimerCounts EQU h'20' ; Saving timer counts
; vars
SAVE_W EQU h'21' ; Interrupt temporary W storage
SAVE_STAT EQU h'22' ; Interrupt temporary STATUS FSR storage
; I2C status
I2C_STAT EQU h'23' ; Check if I2C is connected
I2C_ADDR EQU 0x77 ; I2C address (const)
; Delay count registers
DInd1 EQU h'24'
DInd2 EQU h'25'
; Delay times
DTime1 EQU .199 ; Inner loop count (~1 ms per pass)
DTime2 EQU .60 ; Outer loop count - nested loop runs for 59,941 cycles (~60 ms)
; SCL and SDA locations
#define _SDA PORTB,RB1 ; Make _SDA easier to check
#define _SCL PORTB,RB4 ; Make _SCL easier to check
;*******************************************************************************
;
; Memory init & interrupts
;
;*******************************************************************************
ORG 0x00
GOTO Init
; Interrupt Service Routine ----------------------------------------------------
ORG 0x0004 ; ISR beginning
SaveState
MOVWF SAVE_W ; Save W register
SWAPF STATUS, W ; Save STATUS reg
MOVWF SAVE_STAT ; ... into temporary reg
LoadAndSend
MOVFW TimerCounts ; Load last pulse period into W
MOVWF SSPBUF ; Load SSPBUF with W (will be sent)
BSF SSPCON,CKP ; Set CKP bit to indicate our buffer is ready
LoadState
SWAPF SAVE_STAT,W ; Load STATUS
MOVWF STATUS
SWAPF SAVE_W, F ; Load W
SWAPF SAVE_W, W ; Load W into W
BCF PIR1,SSPIF
RETFIE ; ... and return to program execution
;*******************************************************************************
;
; Program start
;
;*******************************************************************************
Init ORG 0x0020
; Set up oscillator to 4 MHz
SetOsc_4MHz
BANKSEL OSCCON
CLRF OSCCON
BSF OSCCON, IRCF1 ; Bit 5
BSF OSCCON, IRCF2 ; Bit 6
; Tuned value from Lab 7 - adjust as needed
MOVLW 0x16
MOVWF OSCTUNE
; Reset IO to all digital, all outputs, clear latches
ResetIO
BANKSEL PORTA ; Clear data latch registers
CLRF PORTA
CLRF PORTB
BANKSEL TRISA ; Data direction on PORTA / PORTB
CLRF TRISA ; Set PORTA to all output
CLRF TRISB ; Set PORTB to all output
BANKSEL ANSEL ; Select digital / analogue register
CLRF ANSEL ; And set to all-digital inputs
; ... and set what we need to
SetIO
BANKSEL TRISA
MOVLW 0xFF
MOVWF TRISA ; Set PORTA direction to all inputs
BCF TRISA, RA0 ; ... except PORTA<0>, an output
SetTMR0
; TMR0 duration between pulses indicates our radar distance
BANKSEL OPTION_REG ; TMR0 operation controlled via OPTION_REG
CLRF OPTION_REG ; Clear it - sets PSA to TMR0, low-hi edge
BSF OPTION_REG, 0x0
BSF OPTION_REG, 0x2 ; Set prescaler rate to 1:64 w.r.t oscillator (101)
BANKSEL INTCON ; Select interrupt control register
BCF INTCON,TMR0IE ; Disable TMR0 OF interrupt (not using here)
BCF INTCON,TMR0IF ; Clear flag (maybe unnecessary - do anyway)
;*******************************************************************************
;
; I2C configuration
; We're using this device in 7-bit I2C slave-mode with stop + start
; interrupts enabled.
;
;*******************************************************************************
SetI2C
BANKSEL TRISB ; Bank 1
BSF TRISB, TRISB1 ; Set PORTB<1> (SDA) as an input
BSF TRISB, TRISB4 ; Set PORTB<4> (SCL) as an input
; BANKSEL SSPSTAT ; Bank 1
CLRF SSPSTAT ; Reset SSPSTAT
BANKSEL SSPCON ; Bank 0
BSF SSPCON, SSPEN ; Turn on SSP
BSF SSPCON, CKP ; Enable clock (if 0, holds clock low)
BSF SSPCON, SSPM3 ; I2C 7-bit slave-mode w/ int is SSPM<3:0>=1110
BSF SSPCON, SSPM2
BSF SSPCON, SSPM1
BCF SSPCON, SSPM0
BANKSEL SSPADD ; Bank 1
MOVLW I2C_ADDR ; Load I2C address into W
MOVWF SSPADD ; Load SSPADD with I2C_ADDR (0x77, chosen arbitrarily)
RLF SSPADD,F ; Left-shift since SSPSR<7:1> compared
; Interrupt configuration
BANKSEL INTCON ; Bank 0
BSF INTCON, GIE ; Enable global interrupts - SSP interrupt on START
BSF INTCON, PEIE ; Enable peripheral interrupts
BANKSEL PIE1 ; Select PIE register
BSF PIE1, SSPIE ; Enable SSP interrupts in peripheral interrupt reg
;*******************************************************************************
;
; Main program loop
; Send the sonic module a 10-microsecond trigger pulse. Wait for a response,
; then save time it took into a register. Repeatedly do this.
;
; Once I2C interrupt is detected, send information via I2C.
;
; TODO only start this loop if START condition detected, stop with I2C STOP.
;
;*******************************************************************************
; Reset to bank 0
BCF STATUS, RP0
BCF STATUS, RP1 ; All of the action is in Bank 0 now
MainLoop
; Send sonic device a 10-microsecond pulse
Pulse
BSF PORTA, RA0 ; Set PORTA<0> output to high
NOP
NOP
NOP
NOP
NOP
NOP
NOP
NOP
NOP
BCF PORTA, RA0 ; Set PORTA<0> output low
; Loop until PORTA<1> goes HI, indicating a response
WaitForResp
BTFSS PORTA, RA1 ; Check PORTA<1>
GOTO WaitForResp ; ... and loop if it's still zero
; Start of our response
CLRF TMR0 ; Start timer from now
WaitUntilLow
BTFSC PORTA, RA1 ; Check PORTA<1> to see if response finished
GOTO WaitUntilLow ; ... loop if still not LOW
; We have a response! Store timer value in a variable
MOVFW TMR0
MOVWF TimerCounts
; Delay for a bit so we aren't constantly polling
CALL Delay
; Return to MainLoop
GOTO MainLoop
;*******************************************************************************
;
; Subroutines
;
;*******************************************************************************
; Loop from Template_for_Ranger_with_interrupts.asm, course lab template
Delay
MOVLW DTime2
MOVWF DInd2
Loop1 MOVLW DTime1
MOVWF DInd1
Subloop1 NOP
NOP
DECFSZ DInd1,F
GOTO Subloop1
DECFSZ DInd2,F
GOTO Loop1
RETURN
Finish
END
The code for the Mega 2560 is written in Results.
1. With version v5.40 of MPLAB X, Microchip's dedicated compiler/IDE, Microchip upgraded all their binaries to 64-bit. However, since mpasmx was not updated from 32 bits, and macOS has not supported 32-bit applications since Catalina (Mojave was the last release to run them), Mac users will need to either use Windows or a VM.
2. See Wikipedia's page on the I2C addressing structure for more details.